I just noticed this:
df[df.condition1 & df.condition2]
df[(df.condition1) & (df.condition2)]
Why is the output of these two lines different?
I cannot share the exact data, but I will try to provide as many details as possible:
df[df.col1 == False & df.col2.isnull()]  # returns 33 rows and the rule `df.col2.isnull()` is not in effect
df[(df.col1 == False) & (df.col2.isnull())]  # returns 29 rows and both conditions are applied correctly
Thanks to @jezrael and @ayhan, here is what happened. Let me use the example provided by @jezrael:
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [True, False, False, False], 'col2': [4, np.nan, np.nan, 1]})
print(df)
    col1  col2
0   True   4.0
1  False   NaN
2  False   NaN
3  False   1.0
If we look at row 3:
   col1  col2
3  False   1.0
and the way I wrote the condition:
df.col1 == False & df.col2.isnull() # is equivalent to False == False & False
Since `&` has higher precedence than `==`, the unparenthesized expression `False == False & False` is equivalent to:
False == (False & False)
print(False == (False & False))  # prints True
With parentheses:
print((False == False) & False)  # prints False
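To see the same effect on the whole example frame rather than on a single row, here is a minimal sketch. It reuses the `df` from the example above; the mask variable names are my own, and the `~df.col1` spelling is just one possible alternative, not something taken from the answers.

```python
import numpy as np
import pandas as pd

# Example frame from @jezrael's answer
df = pd.DataFrame({'col1': [True, False, False, False],
                   'col2': [4, np.nan, np.nan, 1]})

# Without parentheses this is parsed as: df.col1 == (False & df.col2.isnull())
# (False & <Series>) is all False, so the isnull() condition has no effect.
mask_unparenthesized = df.col1 == False & df.col2.isnull()

# With parentheses both conditions are applied.
mask_parenthesized = (df.col1 == False) & (df.col2.isnull())

# Alternative spelling (an assumption on my part): negate the boolean column
# with ~, which binds tighter than &, so it avoids the == comparison entirely.
mask_negated = ~df.col1 & df.col2.isnull()

print(len(df[mask_unparenthesized]))  # 3
print(len(df[mask_parenthesized]))    # 2
```

On this small frame the unparenthesized version keeps row 3 even though `col2` is not null there, which mirrors the 33 vs. 29 rows difference in the question.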
I think it's a little easier to illustrate this problem with numbers:
print(5 == 5 & 1)  # prints False, because 5 & 1 is 1 and 5 == 1 is False
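If you want to check how Python groups an expression without reasoning about values at all, one option is to compare parse trees with the standard `ast` module. This is only a sketch; the helper name `tree` and the placeholder names `a`, `b`, `c` are made up for illustration.

```python
import ast

def tree(expr):
    # Dump the parsed expression; parentheses are not stored in the tree,
    # so two expressions with the same grouping produce the same dump.
    return ast.dump(ast.parse(expr, mode="eval"))

unparenthesized = tree("a == b & c")

print(unparenthesized == tree("a == (b & c)"))  # True: & is grouped first
print(unparenthesized == tree("(a == b) & c"))  # False: not what was written
```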
So, the lesson: always add parentheses around each condition!
I wish I could split the answer points between @jezrael and @ayhan :(