Getting (index, column) pairs for True Boolean DataFrame elements in Pandas

Question


Let's say I have a Pandas DataFrame, and I want to get a list of tuples of the form [(index1, column1), (index2, column2) ...] that describe the location of all the DataFrame elements where some condition is true. For instance:

    x = pd.DataFrame(np.random.normal(0, 1, (4,4)), index=['a', 'b', 'c', 'd'], columns=['e', 'f', 'g', 'h'])
    x
              e         f         g         h
    a -1.342571 -0.274879 -0.903354 -1.458702
    b -1.521502 -1.135800 -1.147913  1.829485
    c -1.199857  0.458135 -1.993701 -0.878301
    d  0.485599  0.286608 -0.436289 -0.390755
    y = x > 0

Is there any way to get:

 x.loc[y]

To return:

 [(b, h), (c,f), (d, e), (d,f)]

Or some equivalent? Obviously I can do:

    postup = []
    for i in x.index:
        for j in x.columns:
            if x.loc[i, j] > 0:
                postup.append((i, j))

But I suppose that something better may be possible / already implemented. In Matlab, the find function in conjunction with sub2ind does the job.

+11

python pandas

dylkot Nov 10 '14 at 22:21


3 answers

My approach uses a MultiIndex:

    # make it a multi-indexed Series
    stacked = y.stack()
    # restrict to where it is True
    true_stacked = stacked[stacked]
    # get the index as a list of tuples
    result = true_stacked.index.tolist()
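For reference, here is a minimal runnable sketch of the same stacking idea. It assumes the values printed in the question (hard-coded here so the result is reproducible rather than random); the name `pairs` is only illustrative.

```python
import pandas as pd

# Values copied from the frame printed in the question.
x = pd.DataFrame(
    [[-1.342571, -0.274879, -0.903354, -1.458702],
     [-1.521502, -1.135800, -1.147913, 1.829485],
     [-1.199857, 0.458135, -1.993701, -0.878301],
     [0.485599, 0.286608, -0.436289, -0.390755]],
    index=['a', 'b', 'c', 'd'],
    columns=['e', 'f', 'g', 'h'],
)
y = x > 0

# Stacking turns the boolean frame into a Series keyed by (index, column).
stacked = y.stack()

# Boolean-indexing the Series with itself keeps only the True entries.
pairs = stacked[stacked].index.tolist()
print(pairs)  # [('b', 'h'), ('c', 'f'), ('d', 'e'), ('d', 'f')]
```

Because the mask itself is stacked, there are no NaN values involved, so the result should not depend on how `stack()` treats missing data.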

+2

exp1orer Nov 10 '14 at 23:36


If one tuple is required for each row index:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.normal(0, 1, (4,4)), index=['a', 'b', 'c', 'd'], columns=['e', 'f', 'g', 'h'])

    # build column replacement
    column_dict = {}
    for col in [{col: {True: col}} for col in df.columns]:
        column_dict.update(col)

    # replace where > 0
    df = (df>0).replace(to_replace=column_dict)

    # convert to tuples and drop 'False' values
    [tuple(y for y in x if y != False) for x in df.to_records()]
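If the goal is one tuple per row (the row label followed by the columns where the condition holds), a shorter route may be to index the column labels with each row of the mask. This is a sketch of a different technique from the `replace`-based one above, not a restatement of it; the data is hard-coded from the question and `per_row` is an illustrative name.

```python
import pandas as pd

# Values copied from the frame printed in the question.
x = pd.DataFrame(
    [[-1.342571, -0.274879, -0.903354, -1.458702],
     [-1.521502, -1.135800, -1.147913, 1.829485],
     [-1.199857, 0.458135, -1.993701, -0.878301],
     [0.485599, 0.286608, -0.436289, -0.390755]],
    index=['a', 'b', 'c', 'd'],
    columns=['e', 'f', 'g', 'h'],
)
mask = x > 0

# For each row, select the column labels where the mask is True
# and prepend the row label.
per_row = [
    (idx, *mask.columns[row.to_numpy()])
    for idx, row in mask.iterrows()
]
print(per_row)  # [('a',), ('b', 'h'), ('c', 'f'), ('d', 'e', 'f')]
```

Rows with no matching column still appear, as a one-element tuple holding only the row label.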

+2

allen-smithee Nov 11 '14 at 0:02


A. Coady · Accepted Answer · 2014-11-11T06:26:21+0000

 x[x > 0].stack().index.tolist()
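The one-liner relies on `stack()` discarding the NaN entries that `x[x > 0]` leaves behind. As far as I know, newer pandas releases have been changing that default (see the `future_stack` parameter), so the sketch below adds an explicit `dropna()` to avoid depending on it. It also shows a NumPy route via `np.nonzero`, which is probably the closest analogue to MATLAB's `find`. The data is hard-coded from the question; `pairs` and `pairs_np` are illustrative names.

```python
import numpy as np
import pandas as pd

# Values copied from the frame printed in the question.
x = pd.DataFrame(
    [[-1.342571, -0.274879, -0.903354, -1.458702],
     [-1.521502, -1.135800, -1.147913, 1.829485],
     [-1.199857, 0.458135, -1.993701, -0.878301],
     [0.485599, 0.286608, -0.436289, -0.390755]],
    index=['a', 'b', 'c', 'd'],
    columns=['e', 'f', 'g', 'h'],
)

# Mask non-matching cells to NaN, stack, and drop the NaN entries explicitly.
pairs = x[x > 0].stack().dropna().index.tolist()

# NumPy route: integer positions of the True cells, mapped back to labels.
rows, cols = np.nonzero((x > 0).to_numpy())
pairs_np = list(zip(x.index[rows], x.columns[cols]))

print(pairs)     # [('b', 'h'), ('c', 'f'), ('d', 'e'), ('d', 'f')]
print(pairs_np)  # same pairs, in the same row-major order
```

The NumPy version skips building an intermediate stacked Series, which may matter for large frames, though that is not something measured here.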
