Boolean with groupby in pandas

I would like to use pandas groupby in a particular way. Given a DataFrame with two boolean columns (call them col1 and col2) and an id column, I want to add a column as follows:

For each record, if col2 is True and col1 is True for any of the records with the same id, then assign True. Otherwise, assign False.

I made a simple example:

    import pandas as pd

    df = pd.DataFrame([[0, 1, 1, 2, 2, 3, 3],
                       [False, False, False, False, False, False, True],
                       [False, True, False, False, True, True, False]]).transpose()
    df.columns = ['id', 'col1', 'col2']

This gives the following DataFrame:

       id   col1   col2
    0   0  False  False
    1   1  False   True
    2   1  False  False
    3   2  False  False
    4   2  False   True
    5   3  False   True
    6   3   True  False

According to the rule above, the following column should be added:

    0    False
    1    False
    2    False
    3    False
    4    False
    5     True
    6    False

Any ideas on an elegant way to do this?

2 answers
    df.groupby('id').col1.transform('any') & df.col2

    0    False
    1    False
    2    False
    3    False
    4    False
    5     True
    6    False
    dtype: bool
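For anyone who wants to run this end to end, here is a minimal sketch of the same idea. It assumes the data is built from a dict so that col1 and col2 keep a bool dtype (the transposed list in the question produces object columns), and the column name `result` is only a placeholder, not something from the question.

```python
import pandas as pd

# Same values as the question, but built from a dict (an assumption)
# so that col1 and col2 are stored with bool dtype.
df = pd.DataFrame({
    'id':   [0, 1, 1, 2, 2, 3, 3],
    'col1': [False, False, False, False, False, False, True],
    'col2': [False, True, False, False, True, True, False],
})

# transform('any') computes any() per id group and broadcasts the result
# back to every row of that group, so it aligns with df's index.
any_col1 = df.groupby('id')['col1'].transform('any')

# 'result' is a hypothetical column name chosen for this sketch.
df['result'] = any_col1 & df['col2']
print(df)
```

The key point is that `transform` returns a Series with the same length and index as `df`, so it can be combined with `df.col2` directly, with no merge step.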

This code will produce the requested result:

    df2 = df.merge(df.groupby('id')['col1']  # group on "id" and select 'col1'
                     .any()                  # True if any items are True
                     .rename('cond2')        # name Series 'cond2'
                     .to_frame()             # make a dataframe for merging
                     .reset_index())         # reset_index to get id column back
    print(df2.col2 & df2.cond2)              # True when 'col2' and 'cond2' are True
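As a rough check, the merge approach can be compared against a `transform('any')` version on the same data. This is only a sketch: it assumes a dict-built frame with bool columns, and it passes `on='id'` and `how='left'` explicitly (my additions, not part of the code above) so that every row of `df` is kept in its original order.

```python
import pandas as pd

# Same values as the question, built from a dict (an assumption) for bool dtype.
df = pd.DataFrame({
    'id':   [0, 1, 1, 2, 2, 3, 3],
    'col1': [False, False, False, False, False, False, True],
    'col2': [False, True, False, False, True, True, False],
})

# One row per id: True if any col1 value in that group is True.
cond2 = df.groupby('id')['col1'].any().rename('cond2').reset_index()

# Left merge on 'id' attaches the per-group flag to every original row.
df2 = df.merge(cond2, on='id', how='left')
merged_result = df2['col2'] & df2['cond2']

# The same rule computed without a merge, for comparison.
transform_result = df.groupby('id')['col1'].transform('any') & df['col2']

print(merged_result.tolist() == transform_result.tolist())
```

Both routes apply the same rule; the merge version materializes an extra `cond2` column, which can be handy if the per-group flag is needed again later.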

Source: https://habr.com/ru/post/1265761/

