Reading a text file with pandas where some rows have blank elements

Question


I have a dataset in a text file that looks like this:

0    0CF00400 X       8  66  7D  91  6E  22  03  0F  7D        0.021650 R
0    18EA0080 X       3  E9  FE  00                            0.022550 R
0    00000003 X       8  D5  64  22  E1  FF  FF  FF  F0        0.023120 R

I read it using

 file_pandas = pd.read_csv(fileName, delim_whitespace = True, header = None, engine = 'python')

And got this output:

  0  0  0CF00400  X  8  66  7D  91        6E  22    03    0F    7D  0.02165
  1  0  18EA0080  X  3  E9  FE   0  0.022550   R  None  None  None      NaN
  2  0  00000003  X  8  D5  64  22        E1  FF    FF    FF    F0  0.02312

But I want it read as:

  0  0  0CF00400  X  8  66  7D  91  6E  22  03  0F  7D  0.021650  R
  1  0  18EA0080  X  3  E9  FE  00                      0.022550  R
  2  0  00000003  X  8  D5  64  22  E1  FF  FF  FF  F0  0.023120  R

I tried removing delim_whitespace = True and replacing it with delimiter = " ", but that just merged the first four columns of the output shown above. It did parse the rest of the data correctly, meaning the remaining columns matched the original text file (apart from NaN values where the blanks are).

I am not sure how to proceed from here.

Side note: 00 is parsed as just 0. Is there a way to display 00 instead?


python pandas

Aditya salapaka Oct 19 '16 at 15:15


1 answer

Psidom · Accepted Answer · 2016-10-19T15:26:28+0000

Your data appears to be in fixed-width columns, so you can try pandas.read_fwf():

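 from io import StringIO

 import pandas as pd

 # Sample rows in fixed-width layout; the padding is reconstructed to match `widths`.
 data = (
     "0    0CF00400 X       8  66  7D  91  6E  22  03  0F  7D        0.021650 R\n"
     "0    18EA0080 X       3  E9  FE  00                            0.022550 R\n"
     "0    00000003 X       8  D5  64  22  E1  FF  FF  FF  F0        0.023120 R\n"
 )

 # Each width is the number of characters in one column, left to right.
 df = pd.read_fwf(StringIO(data), header=None,
                  widths=[1, 12, 2, 8, 4, 4, 4, 4, 4, 4, 4, 4, 16, 2])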
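On the side note about 00 showing up as 0: a column whose values all look numeric gets converted to numbers, which drops the leading zero. One possible way around this, sketched below, is to pass dtype=str so every field stays as text, then convert only the columns that really are numeric. The names `data` and `widths` and the exact column padding in the sample are assumptions made for illustration; the padding was chosen to line up with the widths, so adjust both to your actual file.

```python
from io import StringIO

import pandas as pd

# Sample rows; the column padding is an assumption chosen to match `widths`.
data = (
    "0    0CF00400 X       8  66  7D  91  6E  22  03  0F  7D        0.021650 R\n"
    "0    18EA0080 X       3  E9  FE  00                            0.022550 R\n"
    "0    00000003 X       8  D5  64  22  E1  FF  FF  FF  F0        0.023120 R\n"
)
widths = [1, 12, 2, 8, 4, 4, 4, 4, 4, 4, 4, 4, 16, 2]

# dtype=str keeps every field as text, so "00" is not turned into 0.
# Blank fields are still read as missing values (NaN).
df = pd.read_fwf(StringIO(data), header=None, widths=widths, dtype=str)

# Convert only the columns that are genuinely numeric.
df[3] = df[3].astype(int)      # column 3 holds small integers
df[12] = df[12].astype(float)  # column 12 holds decimal values

# Show the hex fields of the short row, including its blank positions.
print(df.iloc[1, 4:12].tolist())
```

Keeping the hex fields as strings also avoids mixed columns where some values are numbers and others are text. If you need their numeric values later, int(value, 16) converts a single non-missing field.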
