Pandas single column aggregate

Question

I have a Pandas dataframe:

import pandas as pd

test=pd.DataFrame(columns=['GroupID','Sample','SampleMeta','Value'])
test.loc[0,:]='1','S1','S1_meta',1
test.loc[1,:]='1','S1','S1_meta',1
test.loc[2,:]='2','S2','S2_meta',1

I want to (1) group by two columns ('GroupID' and 'Sample'), (2) sum 'Value' for each group, and (3) keep only the unique value of 'SampleMeta' for each group. The desired result is shown below (with 'GroupID' and 'Sample' as the index):

                SampleMeta  Value
GroupID Sample                       
1       S1      S1_meta      2
2       S2      S2_meta      1

df.groupby() with the .sum() method gets close, but .sum() also concatenates the string values in the 'SampleMeta' column within each group. As a result, the value 'S1_meta' is duplicated.

g=test.groupby(['GroupID','Sample'])
print(g.sum())

                SampleMeta      Value
GroupID Sample                       
1       S1      S1_metaS1_meta  2
2       S2      S2_meta         1

Is there a way to achieve the desired result using groupby() and related methods? Merging the summed 'Value' for each group with a separate 'SampleMeta' DataFrame works, but there should be a more elegant solution.

python pandas

lmart999 13 '14 22:06

Karl D. · Accepted Answer · 2014-05-13T23:50:43+0000

You could simply add SampleMeta to your groupby:

print(test.groupby(['GroupID','Sample','SampleMeta']).sum())

                           Value
GroupID Sample SampleMeta       
1       S1     S1_meta         2
2       S2     S2_meta         1

And if you don't want SampleMeta as part of the index when you are done, you can move that level back into a column:

print(test.groupby(['GroupID','Sample','SampleMeta']).sum().reset_index(level=2))

               SampleMeta  Value
GroupID Sample                  
1       S1        S1_meta      2
2       S2        S2_meta      1

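This only works correctly if SampleMeta does not vary within each ['GroupID','Sample'] group. If it does vary within ['GroupID','Sample'], then you probably want to leave SampleMeta out of the groupby/sum entirely: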

print(test.groupby(['GroupID','Sample'])['Value'].sum())

GroupID  Sample
1        S1        2
2        S2        1
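
As a rough sketch of how one might check the "no variation" condition before choosing between these approaches, the snippet below counts the distinct SampleMeta values per group with nunique(). It then shows one possible alternative that was not part of the answer above: keeping the first SampleMeta per group while summing Value in a single agg() call. The names meta_counts, consistent and result are made up for illustration, and the DataFrame is built in one call (an assumption, so that Value is numeric rather than object).

```python
import pandas as pd

# Same rows as in the question, built in one call so 'Value' is an integer column.
test = pd.DataFrame({
    'GroupID': ['1', '1', '2'],
    'Sample': ['S1', 'S1', 'S2'],
    'SampleMeta': ['S1_meta', 'S1_meta', 'S2_meta'],
    'Value': [1, 1, 1],
})

g = test.groupby(['GroupID', 'Sample'])

# Number of distinct SampleMeta values in each group.
# A count of 1 in every group means SampleMeta does not vary within groups.
meta_counts = g['SampleMeta'].nunique()
consistent = bool((meta_counts == 1).all())

# Alternative (an assumption, not the answer's method): take the first
# SampleMeta in each group and sum Value, using named aggregation.
result = g.agg(SampleMeta=('SampleMeta', 'first'), Value=('Value', 'sum'))

print(consistent)
print(result)
```

If consistent turns out to be False, taking the first SampleMeta silently discards the other values, so in that case dropping SampleMeta from the aggregation, as in the last example of the answer, is probably the safer choice.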
