Pandas: create a single size column alongside the sum columns after grouping by several columns

I have a dataframe where I do groupby on 3 columns and aggregate the sum and size of the numeric columns. After running the code

df = df.groupby(['year', 'cntry', 'state']).agg(['size', 'sum'])

I get something like below:

[Image: dataframe]

Now I want to separate the size sub-columns from the main columns and create just one size column, while keeping the sum columns under the main column headings. I tried different approaches but did not succeed. These are the methods I tried but couldn't get to work for me:

How to count the number of rows in a group in a pandas group by object?

Convert pandas GroupBy to DataFrame

I would be grateful if anyone could help me with this.


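import numpy as np
import pandas as pd

# state names taken from the output shown below
states = ['California', 'Florida', 'Massachusetts', 'Minnesota']

d1 = pd.DataFrame(dict(
    year=np.random.choice((2014, 2015, 2016), 100),
    cntry=['United States' for _ in range(100)],
    State=np.random.choice(states, 100),
    Col1=np.random.randint(0, 20, 100),
    Col2=np.random.randint(0, 20, 100),
    Col3=np.random.randint(0, 20, 100),
))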

df = d1.groupby(['year', 'cntry', 'State']).agg(['size', 'sum'])
df

The simplest way to get the size is to call size on the groupby:

d1.groupby(['year', 'cntry', 'State']).size()

year  cntry          State        
2014  United States  California       10
                     Florida           9
                     Massachusetts     8
                     Minnesota         5
2015  United States  California        9
                     Florida           7
                     Massachusetts     4
                     Minnesota        11
2016  United States  California        8
                     Florida           8
                     Massachusetts    11
                     Minnesota        10
dtype: int64
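
Because this Series shares its index with any other aggregation over the same groupby, it can also be attached to the grouped sums directly, without going through agg(['size', 'sum']). A minimal sketch on made-up toy data (the values and the name `out` are illustrative assumptions, not from the question):

```python
import pandas as pd

# Small deterministic stand-in for d1 (values are invented for illustration)
d1 = pd.DataFrame({
    'year':  [2014, 2014, 2014, 2015, 2015],
    'cntry': ['United States'] * 5,
    'State': ['California', 'California', 'Florida', 'Florida', 'Florida'],
    'Col1':  [1, 2, 3, 4, 5],
    'Col2':  [10, 20, 30, 40, 50],
    'Col3':  [0, 1, 0, 1, 0],
})

g = d1.groupby(['year', 'cntry', 'State'])

# Sum every numeric column, then add the per-group row count as one column.
# The size Series aligns on the same (year, cntry, State) index.
out = g.sum().assign(size=g.size())
print(out)
```

This gives flat column names (Col1, Col2, Col3, size) rather than two-level headings, so it only fits if the sum/size sub-headings are not needed.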

To get the sizes from the already aggregated df instead:

df.xs('size', axis=1, level=1)

That returns a size column for each of ['Col1', 'Col2', 'Col3']. Since the size is the same for all of them, picking one is enough:

df[('Col1', 'size')]

year  cntry          State        
2014  United States  California       10
                     Florida           9
                     Massachusetts     8
                     Minnesota         5
2015  United States  California        9
                     Florida           7
                     Massachusetts     4
                     Minnesota        11
2016  United States  California        8
                     Florida           8
                     Massachusetts    11
                     Minnesota        10
Name: (Col1, size), dtype: int64
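
The sizes agree across Col1, Col2 and Col3 because size counts the rows in each group regardless of what the column holds (unlike count, which skips NaN). A quick check on made-up data, as a sketch rather than part of the original answer; the toy values and the name `same` are assumptions:

```python
import pandas as pd

# Toy stand-in for d1 (values are invented for illustration)
d1 = pd.DataFrame({
    'year':  [2014, 2014, 2014, 2015, 2015],
    'cntry': ['United States'] * 5,
    'State': ['California', 'California', 'Florida', 'Florida', 'Florida'],
    'Col1':  [1, 2, 3, 4, 5],
    'Col2':  [10, 20, 30, 40, 50],
    'Col3':  [0, 1, 0, 1, 0],
})

df = d1.groupby(['year', 'cntry', 'State']).agg(['size', 'sum'])

# One size column per original column: Col1, Col2, Col3
sizes = df.xs('size', axis=1, level=1)

# Compare every size column against Col1's, row by row
same = sizes.eq(sizes['Col1'], axis=0).all().all()
print(same)
```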

Option 1: a flat size column next to the sum columns

pd.concat([df[('Col1', 'size')].rename('size'),
           df.xs('sum', axis=1, level=1)], axis=1)

Option 2: keep the two-level column headings, with size under an empty top-level heading

pd.concat([df[('Col1', 'size')].rename(('', 'size')),
           df.xs('sum', axis=1, level=1, drop_level=False)], axis=1)
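
To see how the two variants differ in their column labels, here is a self-contained sketch on made-up data (the toy values and the names `flat` and `nested` are assumptions for illustration):

```python
import pandas as pd

# Toy stand-in for d1 (values are invented for illustration)
d1 = pd.DataFrame({
    'year':  [2014, 2014, 2014, 2015, 2015],
    'cntry': ['United States'] * 5,
    'State': ['California', 'California', 'Florida', 'Florida', 'Florida'],
    'Col1':  [1, 2, 3, 4, 5],
    'Col2':  [10, 20, 30, 40, 50],
    'Col3':  [0, 1, 0, 1, 0],
})

df = d1.groupby(['year', 'cntry', 'State']).agg(['size', 'sum'])

# Variant 1: drop the sum/size level, giving single-level column names
flat = pd.concat([df[('Col1', 'size')].rename('size'),
                  df.xs('sum', axis=1, level=1)], axis=1)

# Variant 2: keep both levels; size gets an empty top-level label
nested = pd.concat([df[('Col1', 'size')].rename(('', 'size')),
                    df.xs('sum', axis=1, level=1, drop_level=False)], axis=1)

print(flat.columns.tolist())
print(nested.columns.tolist())
```

The first form is easier to work with afterwards (plain `flat['Col1']`), while the second keeps the sum label under each column heading, which is closer to what the question asks for.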

An alternative to piRSquared's answer that builds the column layout explicitly:

group = df.groupby(['year', 'cntry', 'state']).agg(['sum', 'size'])
# the size is identical for every column, so keep one copy of it
sizes = group[('Col1', 'size')]
# keep only the sum sub-columns
mi = pd.MultiIndex.from_product([['Col1', 'Col2', 'Col3'], ['sum']])
group = group.reindex(columns=mi)
# add the single size column under its own top-level heading
group[('Tot', 'size')] = sizes

Output:

                 Col1 Col2 Col3  Tot
                  sum  sum  sum size
year cntry State
2015 US    CA      20    0    4    1
           FL      40    3    5    1
           MASS     8    1    3    1
           MN      12    2    3    1

Source: https://habr.com/ru/post/1684713/