Pandas update column with array

Question

So, I am learning pandas and I have this problem.

Suppose I have a Dataframe:

 A  B    C
 1  x  NaN
 2  y  NaN
 3  x  NaN
 4  x  NaN
 5  y  NaN

I am trying to create this:

 A  B        C
 1  x  [1,3,4]
 2  y    [2,5]
 3  x  [1,3,4]
 4  x  [1,3,4]
 5  y    [2,5]

That is, C should list the A values of all rows that share the same value in B.

I have done this:

 teste = df.groupby(['B'])
 for name, group in teste:
     df.loc[df['B'] == name[0], 'C'] = group['A'].tolist()

But what I got was this, with column C simply copying column A:

 A  B  C
 1  x  1
 2  y  2
 3  x  3
 4  x  4
 5  y  5

Can someone explain why this is happening and how to get the result I want? Thanks :)

+5

python pandas dataframe pandas-groupby

Artur barbosa Jul 19 '17 at 16:02

4 answers

You can use transform to create lists.

 In [324]: df['C'] = df.groupby('B')['A'].transform(lambda x: [x.values])

 In [325]: df
 Out[325]:
    A  B          C
 0  1  x  [1, 3, 4]
 1  2  y     [2, 5]
 2  3  x  [1, 3, 4]
 3  4  x  [1, 3, 4]
 4  5  y     [2, 5]
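On the "why" part of the question: when the right-hand side of a `df.loc[mask, 'C'] = ...` assignment is a list with as many elements as there are selected rows, pandas hands out one element per row instead of storing the whole list in every cell. That matches the output in the question, where C ends up equal to A. A minimal sketch of that behavior, using the question's sample data and made-up values (10, 30, 40) chosen only to make the effect visible:

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": ["x", "y", "x", "x", "y"]})
df["C"] = 0  # placeholder column so the effect of the assignment is easy to see

mask = df["B"] == "x"  # selects three rows

# Three selected rows and a three-element list: each row receives ONE element,
# not the whole list. The values 10, 30, 40 are illustrative only.
df.loc[mask, "C"] = [10, 30, 40]

print(df)
```

In the question's loop the list is `group['A'].tolist()`, so each selected row simply gets its own A value back.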

+4

Zero Jul 19 '17 at 16:24

Getting creative with sum!
Turn each value in A into a single-element list, then do a groupby transform with sum, which concatenates the lists within each group.

 df.assign(
     C=pd.Series(
         df.A.values[:, None].tolist(), df.index
     ).groupby(df.B).transform('sum')
 )

    A  B          C
 0  1  x  [1, 3, 4]
 1  2  y     [2, 5]
 2  3  x  [1, 3, 4]
 3  4  x  [1, 3, 4]
 4  5  y     [2, 5]
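The trick leans on the fact that adding Python lists concatenates them, so summing single-element lists builds one combined list. A plain-Python sketch of that idea, outside pandas (the variable names are illustrative, not from the answer):

```python
# The A values of group "x" from the question, each wrapped in its own list.
single_element_lists = [[1], [3], [4]]

# sum() with an empty-list start value applies + repeatedly,
# and + on lists concatenates.
combined = sum(single_element_lists, [])

print(combined)
```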

+1

piRSquared Jul 19 '17 at 16:45

 test = df.groupby('B')['A'].apply(list)
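This line yields one list per group, indexed by B, rather than a new column on df. One way to broadcast it back onto the rows is `Series.map`; the sketch below rebuilds the sample frame from the question, and the name `lists_per_group` is mine, not part of the answer:

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": ["x", "y", "x", "x", "y"]})

# One list of A values per distinct B value (a Series indexed by B).
lists_per_group = df.groupby("B")["A"].apply(list)

# Look up each row's B value in that Series to fill column C.
df["C"] = df["B"].map(lists_per_group)

print(df)
```

Rows in the same group end up pointing at the same list object, which is fine for reading but worth keeping in mind before mutating a cell in place.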

0

Rakesh adhikesavan Jul 19 '17 at 16:11

Psidom · Accepted Answer · 2017-07-19T16:08:46+0000

You can first aggregate column A into lists grouped by column B, and then merge the result back onto the original df on B:

 df
 #   A  B
 #0  1  x
 #1  2  y
 #2  3  x
 #3  4  x
 #4  5  y

 df.groupby('B').A.apply(list).rename('C').reset_index().merge(df)
 #   B          C  A
 #0  x  [1, 3, 4]  1
 #1  x  [1, 3, 4]  3
 #2  x  [1, 3, 4]  4
 #3  y     [2, 5]  2
 #4  y     [2, 5]  5
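Note that the merged result comes back grouped by B, with columns in the order B, C, A. If the original row and column order matters, one variation is to put df on the left side of the merge. A sketch under that assumption, with the sample frame rebuilt from the question (`lists` and `result` are illustrative names):

```python
import pandas as pd

# Sample frame rebuilt from the question.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": ["x", "y", "x", "x", "y"]})

# Same aggregation as in the answer: one list of A values per B.
lists = df.groupby("B")["A"].apply(list).rename("C").reset_index()

# A left merge keeps df's rows in their original order and appends C.
result = df.merge(lists, on="B", how="left")

print(result)
```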
