Sum each pair of adjacent columns in a Pandas DataFrame

I have a problem with Pandas. Given this DataFrame:

df=pd.DataFrame([(1,2,3,4,5,6),(1,2,3,4,5,6),(1,2,3,4,5,6)],columns=['a','b','c','d','e','f'])
Out:
    a b c d e f
0   1 2 3 4 5 6
1   1 2 3 4 5 6 
2   1 2 3 4 5 6

I want the output frame to look like this:

Out:
    s1   s2   s3
0   3    7    11
1   3    7    11
2   3    7    11

In other words, sum the column pairs (a, b), (c, d), (e, f) separately and name the resulting columns (s1, s2, s3). Can anyone help solve this problem in Pandas? Thank you very much.

1 answer

1) Perform a groupby over the columns by passing axis=1. Following @Boud's comment, a small tweak to the grouping array gives exactly what you want:

df.groupby((np.arange(len(df.columns)) // 2) + 1, axis=1).sum().add_prefix('s')


The columns are grouped according to this array (adding 1 makes the labels start at 1):

np.arange(len(df.columns)) // 2
# array([0, 0, 1, 1, 2, 2], dtype=int32)
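To make the grouping key concrete, here is a minimal sketch on the six-column frame from the question. It assumes a recent pandas, where groupby(axis=1) is deprecated (since 2.1), so it groups the transposed frame instead; the names key and result are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(1, 2, 3, 4, 5, 6)] * 3, columns=list("abcdef"))

# Integer-divide the column positions by 2 so adjacent columns share a label,
# then add 1 so the labels start at 1 instead of 0.
key = np.arange(len(df.columns)) // 2 + 1   # [1, 1, 2, 2, 3, 3]

# Group the rows of the transposed frame (the original columns) by that key,
# sum each group, and transpose back.
result = df.T.groupby(key).sum().T.add_prefix("s")
print(result)
```

The transposes copy the data, so on large frames this is likely slower than grouping along axis=1 directly, where that is still supported.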

2) Use np.add.reduceat, which is a faster alternative:

df = pd.DataFrame(np.add.reduceat(df.values, np.arange(len(df.columns))[::2], axis=1))
df.columns = df.columns + 1
df.add_prefix('s')
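As a rough illustration of what np.add.reduceat does with those start indices, the sketch below applies it to a single row. The variable names row, starts and pair_sums are made up for the example.

```python
import numpy as np

row = np.array([1, 2, 3, 4, 5, 6])
starts = np.arange(len(row))[::2]   # every second position: [0, 2, 4]

# reduceat sums the slices row[0:2], row[2:4] and row[4:];
# the last segment always runs to the end of the array.
pair_sums = np.add.reduceat(row, starts)
print(pair_sums.tolist())  # [3, 7, 11]
```

Because the last segment runs to the end, a frame with an odd number of columns would leave the final column as a "sum" of itself.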


Timings:

For a DataFrame with 1 million rows and 20 columns:

from string import ascii_lowercase
np.random.seed(42)
df = pd.DataFrame(np.random.randint(0, 10, (10**6,20)), columns=list(ascii_lowercase[:20]))
df.shape
(1000000, 20)

def with_groupby(df):
    return df.groupby((np.arange(len(df.columns)) // 2) + 1, axis=1).sum().add_prefix('s')

def with_reduceat(df):
    df = pd.DataFrame(np.add.reduceat(df.values, np.arange(len(df.columns))[::2], axis=1))
    df.columns = df.columns + 1
    return df.add_prefix('s')

# check that both functions give the same output
with_groupby(df).equals(with_reduceat(df))
True

%timeit with_groupby(df.copy())
1 loop, best of 3: 1.11 s per loop

%timeit with_reduceat(df.copy())   # <--- (>3X faster)
1 loop, best of 3: 345 ms per loop
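Outside IPython the %timeit magic is not available. A comparable measurement can be sketched with the standard-library timeit module. This is only an approximation of the benchmark above: it uses a smaller frame so it runs quickly, and it groups the transposed frame because groupby(axis=1) is deprecated in pandas 2.1+. The function names groupby_transposed and reduceat_pairs are hypothetical stand-ins for with_groupby and with_reduceat.

```python
import timeit
from string import ascii_lowercase

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# 10**4 rows instead of 10**6 so the sketch finishes quickly.
small = pd.DataFrame(rng.integers(0, 10, (10**4, 20)),
                     columns=list(ascii_lowercase[:20]))

def groupby_transposed(df):
    # Same grouping key as with_groupby, applied to the transposed frame.
    key = np.arange(len(df.columns)) // 2 + 1
    return df.T.groupby(key).sum().T.add_prefix("s")

def reduceat_pairs(df):
    # Same approach as with_reduceat: sum adjacent column pairs in NumPy.
    out = pd.DataFrame(
        np.add.reduceat(df.to_numpy(), np.arange(len(df.columns))[::2], axis=1)
    )
    out.columns = out.columns + 1
    return out.add_prefix("s")

for fn in (groupby_transposed, reduceat_pairs):
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(small), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```

Absolute numbers will differ by machine and pandas version, so only the relative ordering is worth reading from the output.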

Source: https://habr.com/ru/post/1661087/
