Column summation in pandas frame

Question

Is there a way to sum across columns in a pandas data frame after grouping? For example, I have the following data frame:

ID   W_1       W_2     W_3 
1    0.1       0.2     0.3
1    0.2       0.4     0.5
2    0.3       0.3     0.2
2    0.1       0.3     0.4
2    0.2       0.0     0.5
1    0.5       0.3     0.2
1    0.4       0.2     0.1

I want an extra column called "my_sum" that holds, for each row, the sum of the values in W_1, W_2 and W_3. The result should look like this:

ID   W_1       W_2     W_3     my_sum
1    0.1       0.2     0.3      0.6
1    0.2       0.4     0.5      1.1
2    0.3       0.3     0.2      0.8
2    0.1       0.3     0.4      0.8
2    0.2       0.0     0.5      0.7
1    0.5       0.3     0.2      1.0
1    0.4       0.2     0.1      0.7

I have done the following:

df['my_sum'] =   df.groupby('ID')['W_1','W_1','W_1'].transform(sum,axis=1)

but this only sums the records in W_1. The documentation mentions an axis argument, but I'm not sure why it has no effect here.

I looked through this question, as well as this one, but they are different from what I want.

+4

python pandas dataframe

owise Aug 29 '17 at 21:13

4 answers

In [7]: df['my_sum'] = df.drop('ID', axis=1).sum(axis=1)

In [8]: df
Out[8]:
   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7

In [9]: df['my_sum'] = df.filter(regex=r'^W_\d+').sum(axis=1)

In [10]: df
Out[10]:
   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7
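
Both calls above can be checked end to end with a small self-contained sketch. The frame is rebuilt from the question's table; `by_drop`, `by_filter` and `expected` are names introduced here only for the check. The comparison uses `np.allclose` because floating-point sums such as 0.1 + 0.2 + 0.3 are not exactly 0.6.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    "ID":  [1, 1, 2, 2, 2, 1, 1],
    "W_1": [0.1, 0.2, 0.3, 0.1, 0.2, 0.5, 0.4],
    "W_2": [0.2, 0.4, 0.3, 0.3, 0.0, 0.3, 0.2],
    "W_3": [0.3, 0.5, 0.2, 0.4, 0.5, 0.2, 0.1],
})

# Option 1: drop the key column, then sum across each row.
by_drop = df.drop(columns="ID").sum(axis=1)

# Option 2: keep only the columns whose names match W_<digits>.
by_filter = df.filter(regex=r"^W_\d+").sum(axis=1)

# Values taken from the "my_sum" column in the question.
expected = [0.6, 1.1, 0.8, 0.8, 0.7, 1.0, 0.7]
assert np.allclose(by_drop, expected)
assert np.allclose(by_filter, expected)

df["my_sum"] = by_filter
print(df.round(2))
```

No `groupby` is involved: the sum runs along each row, so the ID value plays no part in it.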

+4

MaxU Aug 29 '17 at 21:16

import numpy as np
# Stack the W_* columns into a 2-D array, then sum each row.
df.assign(
    my_sum=np.column_stack([df[c].values for c in df if c.startswith('W_')]).sum(1)
)

   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7

Or, if the columns are simply ['W_1', 'W_2', 'W_3']:

df.assign(my_sum=df[['W_1', 'W_2', 'W_3']].sum(1))

   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7
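
As a rough check that the NumPy route and the plain pandas route agree, here is a minimal sketch. The frame is rebuilt from the question, and `w_cols`, `stacked` and `result` are illustrative names, not part of the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID":  [1, 1, 2, 2, 2, 1, 1],
    "W_1": [0.1, 0.2, 0.3, 0.1, 0.2, 0.5, 0.4],
    "W_2": [0.2, 0.4, 0.3, 0.3, 0.0, 0.3, 0.2],
    "W_3": [0.3, 0.5, 0.2, 0.4, 0.5, 0.2, 0.1],
})

# Iterating over a DataFrame yields its column labels.
w_cols = [c for c in df if c.startswith("W_")]

# One array column per W_* column: 7 rows by 3 columns here.
stacked = np.column_stack([df[c].to_numpy() for c in w_cols])

# assign() returns a new frame and leaves df itself unchanged.
result = df.assign(my_sum=stacked.sum(axis=1))

assert np.allclose(result["my_sum"], df[w_cols].sum(axis=1))
print(result.round(2))
```

Note that `df.assign(...)` does not modify `df`; keep the returned frame if the new column is needed later.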

+4

piRSquared Aug 29 '17 at 21:24

In addition, you can index the data frame with a list naming the columns to sum. This is convenient because column names are easy to collect in a list.

sum_list = ['W_1', 'W_2', 'W_3']
df['my_sum'] = df[sum_list].sum(1)
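
When there are many such columns, the list does not have to be typed by hand. A small sketch, assuming the columns follow the `W_<n>` naming pattern from the question (the `range(1, 4)` bound is specific to this example):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID":  [1, 1, 2, 2, 2, 1, 1],
    "W_1": [0.1, 0.2, 0.3, 0.1, 0.2, 0.5, 0.4],
    "W_2": [0.2, 0.4, 0.3, 0.3, 0.0, 0.3, 0.2],
    "W_3": [0.3, 0.5, 0.2, 0.4, 0.5, 0.2, 0.1],
})

# Build ['W_1', 'W_2', 'W_3'] from the naming pattern.
sum_list = [f"W_{i}" for i in range(1, 4)]

# Indexing with a list selects those columns; axis=1 sums across each row.
df["my_sum"] = df[sum_list].sum(axis=1)
print(df.round(2))
```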

0

sameagol Dec 10 '18 at 23:16

coldspeed · Accepted Answer · 2017-08-29T21:25:47+0000

Each option below selects the columns to add and then calls .sum(1) on the selection.

`df.blocks`

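# Note: DataFrame.blocks was deprecated in pandas 0.21 and removed in 1.0,
# so this option only runs on older versions.
df['my_sum'] = df.blocks['float64'].sum(1)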

`df.select_dtypes`

df['my_sum'] = df.select_dtypes(float).sum(1)
df
   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7
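
One caveat with selecting by dtype, shown as a hedged sketch on the question's data: the selection works only because ID is stored as integers, and the new `my_sum` column is itself a float column, so a second `select_dtypes(float)` call would pick it up as well. The name `float_cols` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID":  [1, 1, 2, 2, 2, 1, 1],
    "W_1": [0.1, 0.2, 0.3, 0.1, 0.2, 0.5, 0.4],
    "W_2": [0.2, 0.4, 0.3, 0.3, 0.0, 0.3, 0.2],
    "W_3": [0.3, 0.5, 0.2, 0.4, 0.5, 0.2, 0.1],
})

# ID is int64, so only the W_* columns are selected.
float_cols = list(df.select_dtypes(float).columns)
print(float_cols)

df["my_sum"] = df.select_dtypes(float).sum(axis=1)

# my_sum is float64 too: re-running the line above would now add it
# into the total, so compute the sum once or select columns by name.
print(list(df.select_dtypes(float).columns))
```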

`df.iloc`

df['my_sum'] = df.iloc[:, 1:].sum(1)
df
   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7

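Boolean indexing

This works on this data because every ID value is at least 1 and every W value is below 1, so the mask keeps only the W values.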

df['my_sum'] = df[df < 1].sum(1)
df
   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7
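
To see why the mask trick gives the right totals here, a minimal sketch on the question's data (`masked` is an illustrative name): entries that fail the condition become NaN, and `sum` skips NaN by default.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID":  [1, 1, 2, 2, 2, 1, 1],
    "W_1": [0.1, 0.2, 0.3, 0.1, 0.2, 0.5, 0.4],
    "W_2": [0.2, 0.4, 0.3, 0.3, 0.0, 0.3, 0.2],
    "W_3": [0.3, 0.5, 0.2, 0.4, 0.5, 0.2, 0.1],
})

# Values that are not below 1 are replaced with NaN.
masked = df[df < 1]

# Every ID is 1 or 2, so the whole ID column is masked out,
# while every W value is below 1 and survives.
print(masked)

# skipna=True is the default, so the NaN entries do not contribute.
df["my_sum"] = masked.sum(axis=1)
```

The approach depends entirely on the value ranges: an ID of 0 or a weight of 1.0 or more would silently change the result.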

`numpy.sum`

Alternatively, with numpy:

df['my_sum'] = df.values[:, 1:].sum(1)
df
   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7

`df.columns.str.contains`

df['my_sum'] = df.iloc[:, df.columns.str.contains('W_')].sum(1)
df
   ID  W_1  W_2  W_3  my_sum
0   1  0.1  0.2  0.3     0.6
1   1  0.2  0.4  0.5     1.1
2   2  0.3  0.3  0.2     0.8
3   2  0.1  0.3  0.4     0.8
4   2  0.2  0.0  0.5     0.7
5   1  0.5  0.3  0.2     1.0
6   1  0.4  0.2  0.1     0.7
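
A side note on name matching, as a sketch only: `str.contains('W_')` matches the pattern anywhere in the column name, while `str.startswith('W_')` anchors it at the beginning. The frame below is a made-up two-row example, and the `RAW_1` column is hypothetical, added purely to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: RAW_1 is not a weight column but contains "W_".
df = pd.DataFrame({
    "ID":    [1, 2],
    "W_1":   [0.1, 0.3],
    "W_2":   [0.2, 0.3],
    "RAW_1": [10.0, 20.0],
})

contains_mask = df.columns.str.contains("W_")
starts_mask = df.columns.str.startswith("W_")

print(list(df.columns[contains_mask]))  # includes RAW_1
print(list(df.columns[starts_mask]))    # only the W_* columns

# A boolean mask over the columns works with .loc as well as .iloc.
df["my_sum"] = df.loc[:, starts_mask].sum(axis=1)
```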
