Getting (index, column) pairs for True Boolean DataFrame elements in Pandas

Let's say I have a Pandas DataFrame, and I want to get a list of tuples of the form [(index1, column1), (index2, column2) ...] that describe the location of all the DataFrame elements where some condition is true. For instance:

    import numpy as np
    import pandas as pd

    x = pd.DataFrame(np.random.normal(0, 1, (4,4)), index=['a', 'b', 'c', 'd'], columns=['e', 'f', 'g', 'h'])
    x
              e         f         g         h
    a -1.342571 -0.274879 -0.903354 -1.458702
    b -1.521502 -1.135800 -1.147913  1.829485
    c -1.199857  0.458135 -1.993701 -0.878301
    d  0.485599  0.286608 -0.436289 -0.390755
    y = x > 0

Is there any way to get:

 x.loc[y] 

To return:

    [(b, h), (c, f), (d, e), (d, f)]

Or some equivalent? Obviously I can do:

    postup = []
    for i in x.index:
        for j in x.columns:
            if x.loc[i, j] > 0:
                postup.append((i, j))

But I suppose something better is possible or already implemented. In MATLAB, the find function in conjunction with ind2sub does the job.

+11
3 answers
 x[x > 0].stack().index.tolist() 
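
One caveat: this one-liner relies on stack() dropping the NaN entries that x[x > 0] leaves behind. As far as I know, newer pandas releases change stack so that it no longer drops them by default, in which case an explicit dropna() would be needed. A minimal sketch that avoids the question entirely by going through NumPy, using made-up fixed values so the result is reproducible (this is an alternative, not the answerer's method):

```python
import numpy as np
import pandas as pd

# Fixed example data (made up for illustration) so the result is reproducible.
x = pd.DataFrame(
    [[-1.3, -0.2, -0.9, -1.4],
     [-1.5, -1.1, -1.1,  1.8],
     [-1.1,  0.4, -1.9, -0.8],
     [ 0.4,  0.2, -0.4, -0.3]],
    index=['a', 'b', 'c', 'd'],
    columns=['e', 'f', 'g', 'h'],
)

# np.nonzero returns the row and column positions of the True entries,
# in row-major order.
rows, cols = np.nonzero((x > 0).to_numpy())

# Map the integer positions back to index and column labels.
pairs = list(zip(x.index[rows], x.columns[cols]))
print(pairs)  # [('b', 'h'), ('c', 'f'), ('d', 'e'), ('d', 'f')]
```

This is close in spirit to MATLAB's find: positions first, labels second.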
+13

My approach uses a MultiIndex:

    # make it a multi-indexed Series
    stacked = y.stack()
    # restrict to where it is True
    true_stacked = stacked[stacked]
    # get index as a list of tuples
    result = true_stacked.index.tolist()
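
Because y holds only booleans, stacking it produces no missing values, so this approach should not depend on how stack treats NaN. A small worked example of the same steps on fixed data (the values are made up for illustration):

```python
import pandas as pd

# Fixed example data (made up for illustration).
x = pd.DataFrame(
    [[-1.3, -0.2, -0.9, -1.4],
     [-1.5, -1.1, -1.1,  1.8],
     [-1.1,  0.4, -1.9, -0.8],
     [ 0.4,  0.2, -0.4, -0.3]],
    index=['a', 'b', 'c', 'd'],
    columns=['e', 'f', 'g', 'h'],
)
y = x > 0

# Boolean Series indexed by (row label, column label) pairs.
stacked = y.stack()

# Use the Series as its own mask to keep only the True entries,
# then read the surviving MultiIndex off as a list of tuples.
result = stacked[stacked].index.tolist()
print(result)  # [('b', 'h'), ('c', 'f'), ('d', 'e'), ('d', 'f')]
```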
+2

If one tuple is required for each row index:

    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.normal(0, 1, (4,4)), index=['a', 'b', 'c', 'd'], columns=['e', 'f', 'g', 'h'])

    # build column replacement: in each column, True maps to the column name
    column_dict = {col: {True: col} for col in df.columns}

    # replace where > 0
    df = (df > 0).replace(to_replace=column_dict)

    # convert to tuples and drop 'False' values
    [tuple(y for y in x if y != False) for x in df.to_records()]
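
For comparison, the same one-tuple-per-row result can probably be reached without replace by using each row of the boolean mask to select column labels. This is a sketch of an alternative, not the answerer's method; the data is fixed and made up so the output is reproducible, and mask and row_tuples are illustrative names:

```python
import pandas as pd

# Fixed example data (made up for illustration).
df = pd.DataFrame(
    [[-1.3, -0.2, -0.9, -1.4],
     [-1.5, -1.1, -1.1,  1.8],
     [-1.1,  0.4, -1.9, -0.8],
     [ 0.4,  0.2, -0.4, -0.3]],
    index=['a', 'b', 'c', 'd'],
    columns=['e', 'f', 'g', 'h'],
)
mask = df > 0

# For each row, start the tuple with the row label and append the labels
# of the columns where the condition holds.
row_tuples = [
    (idx, *mask.columns[row.to_numpy()])
    for idx, row in mask.iterrows()
]
print(row_tuples)  # [('a',), ('b', 'h'), ('c', 'f'), ('d', 'e', 'f')]
```

Rows with no positive entry still yield a one-element tuple containing only the row label, which matches the behavior of the to_records version above.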
+2

Source: https://habr.com/ru/post/978002/