Pandas single column aggregate

I have a Pandas dataframe:

import pandas as pd

# Build the example frame row by row
test=pd.DataFrame(columns=['GroupID','Sample','SampleMeta','Value'])
test.loc[0,:]='1','S1','S1_meta',1
test.loc[1,:]='1','S1','S1_meta',1
test.loc[2,:]='2','S2','S2_meta',1

I want to (1) group by two columns ('GroupID' and 'Sample'), (2) sum 'Value' within each group, and (3) keep only the unique value of 'SampleMeta' for each group. The desired result is shown below (with 'GroupID' and 'Sample' as the index):

                SampleMeta  Value
GroupID Sample                       
1       S1      S1_meta      2
2       S2      S2_meta      1 

df.groupby() with the .sum() method gets close, but .sum() also concatenates the identical strings in the 'SampleMeta' column within each group. As a result, the value "S1_meta" is duplicated:

g=test.groupby(['GroupID','Sample'])
print(g.sum())

                SampleMeta      Value
GroupID Sample                       
1       S1      S1_metaS1_meta  2
2       S2      S2_meta         1 

Is there a way to achieve the desired result using groupby() and related methods? Merging the summed 'Value' for each group with a separate 'SampleMeta' DataFrame works, but there should be a more elegant solution.


You can include SampleMeta as part of the groupby:

print(test.groupby(['GroupID','Sample','SampleMeta']).sum())

                           Value
GroupID Sample SampleMeta       
1       S1     S1_meta         2
2       S2     S2_meta         1

If you don't want SampleMeta to be part of the index afterwards, you can move it back to a column:

print(test.groupby(['GroupID','Sample','SampleMeta']).sum().reset_index(level=2))

               SampleMeta  Value
GroupID Sample                  
1       S1        S1_meta      2
2       S2        S2_meta      1
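
A possible alternative, not part of the answer above, is to keep the two-column groupby and give each column its own aggregation function via agg. This is only a sketch: it rebuilds the question's data with a plain constructor, and it assumes SampleMeta is constant within each group, since 'first' silently keeps the first value it sees.

```python
import pandas as pd

# Same rows as in the question, built in one call so 'Value' is numeric.
test = pd.DataFrame({
    'GroupID': ['1', '1', '2'],
    'Sample': ['S1', 'S1', 'S2'],
    'SampleMeta': ['S1_meta', 'S1_meta', 'S2_meta'],
    'Value': [1, 1, 1],
})

# Named aggregation: each keyword names an output column and
# pairs a source column with the function to apply to it.
# 'first' assumes SampleMeta does not vary within a group.
result = test.groupby(['GroupID', 'Sample']).agg(
    SampleMeta=('SampleMeta', 'first'),
    Value=('Value', 'sum'),
)
print(result)
```

The result keeps 'GroupID' and 'Sample' as the index, with SampleMeta and Value as ordinary columns.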

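Including SampleMeta in the groupby only works as intended if SampleMeta does not vary within each ['GroupID','Sample'] group. If it does vary within a group, you should probably leave SampleMeta out of the groupby/sum entirely: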

print(test.groupby(['GroupID','Sample'])['Value'].sum())

GroupID  Sample
1        S1        2
2        S2        1
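
To decide between the two approaches, it may help to check first whether SampleMeta actually varies within any ['GroupID','Sample'] group. The sketch below counts distinct SampleMeta values per group with nunique. The extra row with 'other_meta' is hypothetical and is added only to show what a varying group looks like.

```python
import pandas as pd

# Same rows as in the question.
test = pd.DataFrame({
    'GroupID': ['1', '1', '2'],
    'Sample': ['S1', 'S1', 'S2'],
    'SampleMeta': ['S1_meta', 'S1_meta', 'S2_meta'],
    'Value': [1, 1, 1],
})

# Number of distinct SampleMeta values in each (GroupID, Sample) group.
meta_counts = test.groupby(['GroupID', 'Sample'])['SampleMeta'].nunique()
varies = meta_counts[meta_counts > 1]
print(varies.empty)  # no group has more than one SampleMeta value

# Hypothetical extra row: same group keys, different SampleMeta.
extra = pd.DataFrame([{'GroupID': '1', 'Sample': 'S1',
                       'SampleMeta': 'other_meta', 'Value': 5}])
mixed = pd.concat([test, extra], ignore_index=True)

mixed_counts = mixed.groupby(['GroupID', 'Sample'])['SampleMeta'].nunique()
print(mixed_counts[mixed_counts > 1])

# Grouping by all three columns now splits group ('1', 'S1') into two rows.
split = mixed.groupby(['GroupID', 'Sample', 'SampleMeta'])['Value'].sum()
print(len(split))
```

If the filtered count is empty, grouping by all three columns gives one row per ['GroupID','Sample'] pair. Otherwise the three-column groupby splits those pairs, and summing 'Value' alone is the safer choice.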

Source: https://habr.com/ru/post/1684708/

