Pandas: Filtering Multiple Conditions

I am trying to do boolean indexing with several conditions in Pandas. My original DataFrame is called df. If I filter step by step as below, I get the expected result:

 temp = df[df["bin"] == 3]
 temp = temp[(~temp["Def"])]
 temp = temp[temp["days since"] > 7]
 temp.head()

However, if I do this (which I think should be equivalent), I get no rows back:

 temp2 = df[df["bin"] == 3]
 temp2 = temp2[~temp2["Def"] & temp2["days since"] > 7]
 temp2.head()

Any idea what explains the difference?

1 answer

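Use parentheses because of operator precedence: & binds more tightly than comparison operators such as >, so ~temp2["Def"] & temp2["days since"] > 7 is evaluated as (~temp2["Def"] & temp2["days since"]) > 7. Wrap each comparison in parentheses: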

 temp2 = df[~df["Def"] & (df["days since"] > 7) & (df["bin"] == 3)] 

Or create conditions on separate lines:

 cond1 = df["bin"] == 3
 cond2 = df["days since"] > 7
 cond3 = ~df["Def"]
 temp2 = df[cond1 & cond2 & cond3]
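To check how Python groups the unparenthesized expression, one option is to compare parse trees with the standard-library ast module. This is only a sketch: a and b are placeholder names standing in for the two Series, and grouping is a hypothetical helper, not part of pandas.

```python
import ast


def grouping(expr: str) -> str:
    """Return a structural dump of how Python parses `expr`."""
    return ast.dump(ast.parse(expr, mode="eval"))


# a and b are placeholders for ~temp2["Def"]'s operand and temp2["days since"].
unparenthesized = grouping("~a & b > 7")
as_python_reads_it = grouping("(~a & b) > 7")
as_intended = grouping("~a & (b > 7)")

# Parentheses are not stored in the tree, so equal dumps mean equal grouping.
print(unparenthesized == as_python_reads_it)  # same parse tree
print(unparenthesized == as_intended)         # different parse tree
```

The comparison ends up as the outermost operation, which is why the mask in the question is not the one that was intended.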

Sample:

 import pandas as pd

 df = pd.DataFrame({'Def':[True] *2 + [False]*4,
                    'days since':[7,8,9,14,2,13],
                    'bin':[1,3,5,3,3,3]})
 print (df)
      Def  bin  days since
 0   True    1           7
 1   True    3           8
 2  False    5           9
 3  False    3          14
 4  False    3           2
 5  False    3          13

 temp2 = df[~df["Def"] & (df["days since"] > 7) & (df["bin"] == 3)]
 print (temp2)
      Def  bin  days since
 3  False    3          14
 5  False    3          13
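For comparison, here is a small sketch that runs the question's two-step version on the same sample data, with and without the parentheses. The names broken_mask and fixed_mask are introduced here for illustration only.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    "Def": [True] * 2 + [False] * 4,
    "days since": [7, 8, 9, 14, 2, 13],
    "bin": [1, 3, 5, 3, 3, 3],
})

temp2 = df[df["bin"] == 3]

# The question's mask: Python evaluates it as (~Def & days since) > 7,
# so the bitwise AND runs first and its result is then compared with 7.
broken_mask = ~temp2["Def"] & temp2["days since"] > 7

# The intended mask: compare first, then combine the two boolean Series.
fixed_mask = ~temp2["Def"] & (temp2["days since"] > 7)

print(broken_mask.sum())                 # number of rows the broken mask keeps
print(temp2[fixed_mask].index.tolist())  # row labels the fixed mask keeps
```

With this data the broken mask keeps no rows, while the fixed mask keeps the rows labelled 3 and 5, matching the step-by-step filtering in the question.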

Source: https://habr.com/ru/post/1275572/

