Pandas query string list

I have a pandas data frame and want to return the rows from the data frame corresponding to the client identifiers that appear in the list of target identifiers.

For example, if my data frame looks like this:

    id   Name   ...   ...
    -------------------------
    1    Bob    ...   ...
    2    Dave   ...   ...
    2    Dave   ...   ...
    3    Phil   ...   ...
    4    Rick   ...   ...
    4    Rick   ...   ...

Basically, I want to return the rows for clients that appear more than once in this data frame, that is, every row whose id occurs more than once:

    id   Name   ...   ...
    -------------------------
    2    Dave   ...   ...
    2    Dave   ...   ...
    4    Rick   ...   ...
    4    Rick   ...   ...

I can get a list of identifiers by doing the following:

    grouped_ids = df.groupby('id').size()
    id_list = grouped_ids[grouped_ids>1].index.tolist()

And now I would like to go back to the data frame and return all the rows corresponding to these identifiers in the list.

Is it possible?

Thanks for the help.

1 answer

I think you are looking for isin():

    In [1]: import pandas as pd
    In [2]: df = pd.DataFrame({'customer_id':range(5), 'A':('a', 'b', 'c', 'd', 'e')})
    In [3]: df
    Out[3]:
       A  customer_id
    0  a            0
    1  b            1
    2  c            2
    3  d            3
    4  e            4
    In [4]: df[df.customer_id.isin((1,3))]
    Out[4]:
       A  customer_id
    1  b            1
    3  d            3

[edit] To filter against your own list of target identifiers, simply pass it as the argument to the isin() method:

    In [5]: mylist = (1,3)
    In [6]: df[df.customer_id.isin(mylist)]
    Out[6]:
       A  customer_id
    1  b            1
    3  d            3
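Applied to the data in the question, this might look like the sketch below. The frame contents are an assumption reconstructed from the example table (the elided "..." columns are left out), and the result name `repeated` is made up for illustration; the `groupby` step is taken from the question as written.

```python
import pandas as pd

# Hypothetical data modelled on the example table in the question;
# the elided "..." columns are omitted.
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 4, 4],
    "Name": ["Bob", "Dave", "Dave", "Phil", "Rick", "Rick"],
})

# Step from the question: collect the ids that occur more than once.
grouped_ids = df.groupby("id").size()
id_list = grouped_ids[grouped_ids > 1].index.tolist()

# Step from the answer: keep only the rows whose id is in that list.
repeated = df[df["id"].isin(id_list)]
print(repeated)
```

If the list of ids is not needed for anything else, `df[df.duplicated("id", keep=False)]` should select the same rows in one step, since `keep=False` marks every occurrence of a repeated id rather than only the later ones.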

Source: https://habr.com/ru/post/1237079/

