How to iterate over the rows of a DataFrame and check whether a value in a column is NaN

I have a beginner question. I have a DataFrame that I am iterating over, and I want to check whether the value in Column2 of each row is NaN, so that I can act on the value if it is not NaN. My DataFrame is as follows:

    df:
      Column1 Column2
    0       a     hey
    1       b     NaN
    2       c      up

I'm trying now:

    for item, frame in df['Column2'].iteritems():
        if frame.notnull() == True:
            print 'frame'

The idea is that I iterate over the rows of Column2 and print frame for each row that has a value (which is a string). However, I get the following:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-80-8b871a452417> in <module>()
          1 for item, frame in df['Column2'].iteritems():
    ----> 2     if frame.notnull() == True:
          3         print 'frame'

    AttributeError: 'float' object has no attribute 'notnull'

When I run only the first line of my code, I get

    0    hey
    1    nan
    2     up

which suggests that the float in that output (the nan) is the cause of the error. Can someone tell me how I can accomplish what I want?

+5
3 answers

As you have already figured out, frame in

    for item, frame in df['Column2'].iteritems():

is each individual value in the column, so its type is the type of the column's elements (which, most likely, is not a Series or a DataFrame). Therefore, frame.notnull() will not work on it.

Instead, you should try:

    for item, frame in df['Column2'].iteritems():
        if pd.notnull(frame):
            print frame
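For reference, here is a self-contained sketch of the same loop for Python 3. It rebuilds the example DataFrame from the question and assumes a recent pandas release, where Series.items() is used in place of iteritems() (iteritems() was removed in pandas 2.0). The list found is a hypothetical helper added only to collect the results.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({'Column1': ['a', 'b', 'c'],
                   'Column2': ['hey', np.nan, 'up']})

found = []  # hypothetical helper list, not part of the original answer
for index, value in df['Column2'].items():
    # pd.notnull accepts scalars, so it works for both strings and float NaN.
    if pd.notnull(value):
        print(value)
        found.append(value)
```

The variable is named value rather than frame here, since each element is a plain scalar and not a DataFrame.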
+4

Try the following:

    df[df['Column2'].notnull()]

The above code returns the rows of df for which Column2 is not null.
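As a rough illustration of this filtering approach, the sketch below rebuilds the example DataFrame from the question and applies the boolean mask. The names mask, filtered, and values are introduced here for illustration only, and the dropna() line is an alternative that may be convenient when only the column's values are needed.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({'Column1': ['a', 'b', 'c'],
                   'Column2': ['hey', np.nan, 'up']})

# Boolean Series: True where Column2 holds a value, False where it is NaN.
mask = df['Column2'].notnull()

# Keep only the rows where the mask is True.
filtered = df[mask]
print(filtered)

# Alternative: drop the NaN entries from the column itself.
values = df['Column2'].dropna().tolist()
print(values)
```

Filtering this way avoids an explicit Python loop, which tends to be both shorter and faster on large frames.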

+1

Using iteritems on a Series (which is what you get when you take a column from a DataFrame) iterates over (index, value) pairs. Thus, your item will take the values 0, 1, and 2 over the three iterations of the loop, and your frame will take the values 'hey', NaN, and 'up' (so "frame" is probably a bad name for it). The error comes from trying to call the notnull method on NaN (which is represented as a floating-point number).

Instead, you can use the pd.notnull function:

    In [3]: pd.notnull(np.nan)
    Out[3]: False

    In [4]: pd.notnull('hey')
    Out[4]: True

Another way would be to call notnull on the entire Series and then iterate over the resulting values (which are now booleans):

    for _, value in df['Column2'].notnull().iteritems():
        if value:
            print 'frame'
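Note that the loop above only sees the booleans, so it prints the literal string 'frame' rather than the stored value. One possible way to act on the actual values is to walk the column and its mask together, as in this Python 3 sketch. The names column, mask, and printed are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({'Column1': ['a', 'b', 'c'],
                   'Column2': ['hey', np.nan, 'up']})

column = df['Column2']
mask = column.notnull()  # boolean Series aligned with the column

# The missing entry is a plain Python float, which is why
# calling .notnull() on it raises AttributeError.
print(type(column.iloc[1]))

printed = []  # hypothetical helper list for illustration
for value, keep in zip(column, mask):
    if keep:
        print(value)
        printed.append(value)
```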
+1

Source: https://habr.com/ru/post/1233696/

