pandas groupby: joining the result back into the DataFrame

I have a DataFrame that looks like this ...

    idn  value
 0  ID1     25
 1  ID1     30
 2  ID2     30
 3  ID2     50

I want to add another column to this frame that holds the maximum of 'value' within each 'idn' group.

I want to get a result similar to this.

    idn  value  max_val
 0  ID1     25       30
 1  ID1     30       30
 2  ID2     30       50
 3  ID2     50       50

I can extract the maximum of 'value' per group using groupby, for example ...

 df[['idn', 'value']].groupby('idn')['value'].max() 

However, I cannot merge this result back into the original DataFrame.

What is the best way to get the desired result?

thanks

2 answers

Use the transform method on the groupby object:

 In [5]: df['maxval'] = df.groupby(by=['idn']).transform('max')

 In [6]: df
 Out[6]:
    idn  value  maxval
 0  ID1     25      30
 1  ID1     30      30
 2  ID2     30      50
 3  ID2     50      50
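For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question. Selecting the 'value' column before calling transform is my own addition, not part of the answer as posted: it keeps the result a single Series even if the frame has more columns. The column name max_val follows the desired output in the question.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({
    "idn": ["ID1", "ID1", "ID2", "ID2"],
    "value": [25, 30, 30, 50],
})

# transform('max') returns one value per original row, aligned on df's index,
# so it can be assigned directly as a new column without any merge.
# Selecting ['value'] first is an assumption made for this sketch.
df["max_val"] = df.groupby("idn")["value"].transform("max")

print(df)
```

Because the result shares the original index, the row order of df is left untouched.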

Set the index of df to idn and then use df.merge. After the merge, reset the index and rename the columns.

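 # per-group maximum, indexed by 'idn'
 dfmax = df.groupby('idn')['value'].max()
 df.set_index('idn', inplace=True)
 # both sides have a 'value' column, so merge adds suffixes to tell them apart
 df = df.merge(dfmax, how='outer', left_index=True, right_index=True)
 df.reset_index(inplace=True)
 # replace the suffixed names with the final column names
 df.columns = ['idn', 'value', 'max_value']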
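The same idea can probably be written without touching the index of df at all. The sketch below is a variation on the merge above, not the code from this answer: it uses DataFrame.join with the on argument, and the names group_max and result are only illustrative.

```python
import pandas as pd

# Example frame from the question
df = pd.DataFrame({
    "idn": ["ID1", "ID1", "ID2", "ID2"],
    "value": [25, 30, 30, 50],
})

# Per-group maximum as a Series indexed by 'idn'; renamed so it does not
# clash with the existing 'value' column
group_max = df.groupby("idn")["value"].max().rename("max_value")

# join(on='idn') looks up each row's 'idn' in the index of group_max
result = df.join(group_max, on="idn")

print(result)
```

This leaves df itself unmodified and keeps its original row index, so no reset_index or column renaming is needed afterwards.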

Source: https://habr.com/ru/post/985155/

