Python Pandas: check if any DataFrame column satisfies a condition

I have a DataFrame with many columns. I need a condition that checks whether any column in a given subset is nonzero. My current code works but looks ugly, and I wonder whether there is a more elegant way to apply such a condition to a subset of columns. My current code is:

df['indicator'] = ( (df['col_1'] != 0) | (df['col_2'] != 0) | (df['col_3'] != 0) | (df['col_4'] != 0) | (df['col_5'] != 0) ) 

I was looking for something like this pseudo code:

 columns = ['col_1', 'col_2', 'col_3', 'col_4', 'col_5']
 df['indicator'] = df.any(columns, lambda value: value != 0)
3 answers

ne is the method form of != . I use it because it chains more cleanly with any. any(axis=1) then checks, for each row, whether at least one value in that row is True.

 df['indicator'] = df[columns].ne(0).any(axis=1) 
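
To make the behaviour of this one-liner concrete, here is a minimal sketch on made-up data. The sample values and the extra column named other are assumptions for illustration only; the column names col_1 to col_3 follow the question.

```python
import pandas as pd

# Made-up sample data; only the column naming follows the question.
df = pd.DataFrame({
    "col_1": [0, 1, 0, 0],
    "col_2": [0, 0, 0, -2],
    "col_3": [0, 0, 0, 2],
    "other": [9, 9, 9, 9],  # hypothetical column that is not part of the check
})
columns = ["col_1", "col_2", "col_3"]

# Element-wise "!= 0" on the selected columns, then a row-wise logical OR.
df["indicator"] = df[columns].ne(0).any(axis=1)

print(df)
```

Rows 1 and 3 get True because at least one of the selected columns is nonzero there; the other column is ignored because it is not listed in columns.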

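In this particular case, you can also check whether the sum of the corresponding columns is nonzero. This is only reliable when the values cannot cancel each other out (for example, when all of them are non-negative). A product would not work here, because it is nonzero only when every column is nonzero:

 df['indicator'] = df[columns].sum(axis=1).ne(0) 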
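
The difference between aggregation-based checks and the any-based check is easiest to see side by side. The following is a rough sketch on made-up data (the values and the names by_any, by_sum, by_abs_sum and by_prod are assumptions for illustration): a plain sum can miss rows whose values cancel out, a sum of absolute values avoids that, and a product is nonzero only when every column is nonzero.

```python
import pandas as pd

# Made-up sample data; row 2 contains values that cancel out (-2 + 2).
df = pd.DataFrame({
    "col_1": [0, 1, 0, 3],
    "col_2": [0, 0, -2, 4],
    "col_3": [0, 0, 2, 5],
})
columns = ["col_1", "col_2", "col_3"]

by_any = df[columns].ne(0).any(axis=1)            # reference: any column nonzero
by_sum = df[columns].sum(axis=1).ne(0)            # misses rows that sum to zero
by_abs_sum = df[columns].abs().sum(axis=1).ne(0)  # cancellation-safe variant
by_prod = df[columns].prod(axis=1).ne(0)          # True only if ALL are nonzero

comparison = pd.DataFrame({
    "any": by_any,
    "sum": by_sum,
    "abs_sum": by_abs_sum,
    "prod": by_prod,
})
print(comparison)
```

On this data only the any and abs_sum columns agree on every row, which is one reason the any-based form is the safer default.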

PS @piRSquared's solution is much more general ...


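Perhaps using max (on booleans, the row-wise max is True if any value is True, whereas min would require all of them to be True):

 df['indicator'] = (df[columns] != 0).max(axis=1).astype(bool)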
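
When reducing a boolean frame with min or max, it helps to check which one matches the intended logic. A small sketch on made-up data (the values and the names mask, row_max and row_min are assumptions for illustration) shows that the row-wise max behaves like any, while the row-wise min behaves like all:

```python
import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame({
    "col_1": [0, 1, 0, 3],
    "col_2": [0, 0, -2, 4],
    "col_3": [0, 0, 2, 5],
})
columns = ["col_1", "col_2", "col_3"]

mask = df[columns] != 0  # boolean frame: True where a value is nonzero

# Since False < True, max picks True if any entry is True,
# and min picks True only if every entry is True.
row_max = mask.max(axis=1).astype(bool)
row_min = mask.min(axis=1).astype(bool)

print(pd.DataFrame({"max": row_max, "min": row_min}))
```

For the question as asked (at least one nonzero column), the max variant is the one that matches.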

Source: https://habr.com/ru/post/1276249/

