Python Pandas: check if any DataFrame column satisfies the condition

Question

I have a DataFrame with many columns. I need a condition that checks whether any column in a subset of them is nonzero. My code looks ugly, and I wonder if there is a more elegant way to apply this condition to a subset of columns. My current code is:

df['indicator'] = ( (df['col_1'] != 0) | (df['col_2'] != 0) | (df['col_3'] != 0) | (df['col_4'] != 0) | (df['col_5'] != 0) )

I was looking for something like this pseudo code:

 columns = ['col_1', 'col_2', 'col_3', 'col_4', 'col_5']
 df['indicator'] = df.any(columns, lambda value: value != 0)

+5

python pandas dataframe

Matthias Apr 4 '18 at 13:33

3 answers

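In this particular case, you can also check whether the sum of the corresponding columns is nonzero. This is only safe when the values cannot cancel each other out (for example, when all of them are non-negative):

 df['indicator'] = df[columns].sum(axis=1).ne(0)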
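A minimal sketch of where a row-sum check and the `any`-based check can disagree. The sample data and the two column names are assumptions made up for illustration, and `abs()` is a variation that is not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data (not from the question).
df = pd.DataFrame({
    "col_1": [0, 1, 2],
    "col_2": [0, 0, -2],
})
columns = ["col_1", "col_2"]

# Plain sum: the last row sums to 0 although both values are nonzero.
by_sum = df[columns].sum(axis=1).ne(0)

# Summing absolute values avoids the cancellation.
by_abs_sum = df[columns].abs().sum(axis=1).ne(0)

# Reference result: is any selected column nonzero in the row?
by_any = df[columns].ne(0).any(axis=1)

print(by_sum.tolist())      # [False, True, False]
print(by_abs_sum.tolist())  # [False, True, True]
print(by_any.tolist())      # [False, True, True]
```

If the columns can hold both positive and negative numbers, the `any`-based form is probably the safer choice.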

PS @piRSquared's solution is much more general ...

+2

Maxu Apr 4 '18 at 13:44

Perhaps using max, since the maximum of a row of booleans is True as soon as one of them is True:

 df['indicator']=(df[columns]!=0).max(axis=1).astype(bool)
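A small sketch of how row-wise `max` and `min` behave on a boolean mask, assuming hypothetical sample data: `max` corresponds to "any column is nonzero", while `min` corresponds to "every column is nonzero".

```python
import pandas as pd

# Hypothetical sample data (not from the question).
df = pd.DataFrame({"col_1": [0, 1, 3], "col_2": [0, 0, 4]})
columns = ["col_1", "col_2"]

mask = df[columns] != 0

# True when at least one selected column is nonzero (same as any).
row_max = mask.max(axis=1).astype(bool)

# True only when every selected column is nonzero (same as all).
row_min = mask.min(axis=1).astype(bool)

print(row_max.tolist())  # [False, True, True]
print(row_min.tolist())  # [False, False, True]
```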

+2

Wen Apr 4 '18 at 13:52

piRSquared · Accepted Answer · 2018-04-04T13:34:36+0000

ne is the method form of != . I use it so that chaining any onto it looks nicer. I use any(axis=1) to find out whether any value in a row is True.

 df['indicator'] = df[columns].ne(0).any(axis=1)
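A self-contained sketch of this approach next to the chained comparison from the question. The sample values and the extra `other` column are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical sample data; "other" is deliberately left out of the check.
df = pd.DataFrame({
    "col_1": [0, 0, 5],
    "col_2": [0, 2, 0],
    "col_3": [0, 0, 0],
    "other": [9, 9, 9],
})
columns = ["col_1", "col_2", "col_3"]

# Chained version, as in the question.
chained = (df["col_1"] != 0) | (df["col_2"] != 0) | (df["col_3"] != 0)

# Element-wise "not equal to 0", then reduce each row with any.
df["indicator"] = df[columns].ne(0).any(axis=1)

print(df["indicator"].tolist())  # [False, True, True]
```

Only the columns named in `columns` take part, so the nonzero values in `other` do not affect the first row.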
