Python Pandas read_excel does not recognize null cell

My Excel sheet:

   A      B
1  first  second
2
3
4  x      y
5  z      j

Python code:

import pandas as pd

df = pd.read_excel(filename, parse_cols=1)

returns the correct result:

  first second
0 NaN   NaN
1 NaN   NaN
2 x     y
3 z     j

If I want to work only with the second column:

df = pd.read_excel(filename, parse_cols=[1])

it returns:

 second
0  y
1  j

I would like to keep the information about empty Excel rows (NaN in my df) even if I work with only one specific column. If the output loses the NaN rows, the row positions no longer match the sheet, which is a problem, for example, for the skiprows parameter, etc.

Thanks.

1 answer

Passing the parameter skip_blank_lines=False works for me:

import pandas as pd

df = pd.read_excel('test.xlsx',
                   parse_cols=1,
                   skip_blank_lines=False)
print(df)

       A       B
0  first  second
1    NaN     NaN
2    NaN     NaN
3      x       y
4      z       j

Or, if you need to omit the first line:

df = pd.read_excel('test.xlsx',
                   parse_cols=1,
                   skiprows=1,
                   skip_blank_lines=False)
print(df)

  first second
0   NaN    NaN
1   NaN    NaN
2     x      y
3     z      j
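
A possible explanation for the difference between parse_cols=1 and parse_cols=[1], sketched as a simplified model and not as pandas' actual code: assume the parser treats a row as "blank" only when it holds a single cell and that cell is empty. The function name drop_blank_lines and the rule itself are assumptions made for illustration. Under that assumption, empty rows survive when two columns are read but disappear when only one column is read, which matches the output in the question.

```python
def drop_blank_lines(rows):
    """Hypothetical model of a blank-line filter (an assumption, not pandas code).

    A row is dropped only if it has at most one cell and that cell is an
    empty or whitespace-only string.
    """
    kept = []
    for row in rows:
        if len(row) > 1:
            # Rows with several cells are always kept, even if all are empty.
            kept.append(row)
        elif len(row) == 1 and (not isinstance(row[0], str) or row[0].strip()):
            # A single cell is kept only if it actually holds a value.
            kept.append(row)
    return kept


# Rows 2-5 of the sheet from the question, with empty cells written as "".
two_columns = [["", ""], ["", ""], ["x", "y"], ["z", "j"]]
# The same rows when only the second column is read.
one_column = [[row[1]] for row in two_columns]

print(len(drop_blank_lines(two_columns)))  # 4
print(len(drop_blank_lines(one_column)))   # 2
```

If this model holds for your pandas version, and if that version rejects skip_blank_lines, another option is to read both columns and select the one you need afterwards (for example df[['second']]), which should keep the NaN rows.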

Source: https://habr.com/ru/post/1653701/

