Pandas: select the rows where a given value is in a tuple column

I have a dataframe in which one column contains tuples:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})

       a       b
    0  1  (1, 2)
    1  2  (3, 4)
    2  3  (0, 4)

I would like to select the rows in which the element I provide is in the tuple.

For example, selecting the rows where 4 is in the tuple should give the following result:

       a       b
    1  2  (3, 4)
    2  3  (0, 4)

I tried:

    print(df[df['b'].isin([4])])

But this returns an empty DataFrame:

    Empty DataFrame
    Columns: [a, b]
    Index: []
2 answers

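You need to use apply with the in operator, because isin compares each whole cell (the tuple itself) against the listed values instead of looking inside the tuple: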

    print(df[df['b'].apply(lambda x: 4 in x)])

       a       b
    1  2  (3, 4)
    2  3  (0, 4)
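For reference, here is a self-contained sketch of the same idea that can be pasted and run as is. The helper name rows_containing is made up for illustration and is not part of pandas. It also keeps the isin mask around to show that it is False everywhere.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})


def rows_containing(frame, column, value):
    """Hypothetical helper: rows whose tuple in `column` contains `value`."""
    # Test membership inside each tuple to build a boolean mask.
    mask = frame[column].apply(lambda t: value in t)
    return frame[mask]


# isin compares each whole cell with the listed values; no tuple equals 4,
# so this mask selects nothing.
isin_mask = df['b'].isin([4])

result = rows_containing(df, 'b', 4)
print(result)
```

Wrapping the mask in a small function is only a convenience; the one-liner above does the same thing.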

You can convert the tuples to sets first and then intersect them with the set of values you are looking for:

    In [27]: df[df['b'].map(set) & {4}]
    Out[27]:
       a       b
    1  2  (3, 4)
    2  3  (0, 4)

This also works for multiple values, for example if you are looking for all rows where either 1 or 3 is in the tuple:

    In [29]: df[df['b'].map(set) & {1, 3}]
    Out[29]:
       a       b
    0  1  (1, 2)
    1  2  (3, 4)

Explanation:

    In [30]: df['b'].map(set)
    Out[30]:
    0    {1, 2}
    1    {3, 4}
    2    {0, 4}
    Name: b, dtype: object

    In [31]: df['b'].map(set) & {1, 3}
    Out[31]:
    0     True
    1     True
    2    False
    Name: b, dtype: bool
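The session above depends on how pandas evaluates & between a Series of sets and a plain set, and I have not checked that this behaves the same in every pandas version. A more explicit variant, as a sketch, builds the boolean mask itself with set.isdisjoint. The names wanted and any_mask are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [(1, 2), (3, 4), (0, 4)]})

wanted = {1, 3}  # values to look for (illustrative name)

# True where the tuple shares at least one element with `wanted`;
# isdisjoint accepts any iterable, so the tuples need no conversion.
any_mask = df['b'].map(lambda t: not wanted.isdisjoint(t))

print(df[any_mask])
```

This keeps the "either 1 or 3" meaning of the intersection, but the mask is an ordinary boolean Series, so indexing with it does not depend on operator behavior for object columns.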

Source: https://habr.com/ru/post/1015590/

