I have a dataframe that looks like this:
Month  Fruit    Sales
1      Apple    45
1      Bananas  12
3      Apple    6
1      Kiwi     34
12     Melon    12
I am trying to get a dataframe that looks like this:
Fruit    Sales (month=1)  Sales (month=2)
Apple    55               65
Bananas  12               102
Kiwi     54               78
Melon    132              43
This is what I have so far:
df=df.groupby(['Fruit']).agg({'Sales':np.sum}).reset_index()
There must be some way to filter the arguments in agg() based on the "Month" variable, but I could not find it in the docs. Any help?
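For reference, here is a minimal sketch of one way this could be done, assuming it is acceptable to group by both Fruit and Month and then move Month into the columns with unstack, rather than filtering inside agg(). The name `wide` and the generated "Sales (month=N)" labels are only illustrative, and months with no sales are filled with 0 here.

```python
import pandas as pd

# Sample data copied from the first table in the question.
df = pd.DataFrame({
    "Month": [1, 1, 3, 1, 12],
    "Fruit": ["Apple", "Bananas", "Apple", "Kiwi", "Melon"],
    "Sales": [45, 12, 6, 34, 12],
})

# Sum Sales per (Fruit, Month) pair, then pivot Month from the index
# into the columns; missing combinations become 0.
wide = (
    df.groupby(["Fruit", "Month"])["Sales"]
      .sum()
      .unstack("Month", fill_value=0)
)

# Illustrative column labels mirroring the "Sales (month=N)" headers.
wide.columns = [f"Sales (month={m})" for m in wide.columns]
wide = wide.reset_index()

print(wide)
```

With this sample data only months 1, 3 and 12 occur, so only those three Sales columns appear.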
Edit: Thanks for the solution. To complicate the situation, I would like to summarize another column. Example:
Month  Fruit    Sales  Revenue
1      Apple    45     45
1      Bananas  12     12
3      Apple    6      6
1      Kiwi     34     34
12     Melon    12     12
The preferred output would look similar to this:
            Sales           Revenue
   Fruit    1    3    12    1    3    12
0  Apple    61   6    0     61   6    0
1  Bananas  12   6    0     12   6    0
2  Kiwi     34   0    0     34   0    0
3  Melon    0    0    12    0    0    12
I managed to get this with df.pivot_table(values=['Sales','Revenue'], index='Fruit', columns=['Month'], aggfunc=np.sum).reset_index(), so my problem is resolved.
I tried to do the same with df.groupby(['Fruit', 'Month'])['Sales','Revenue'].sum().unstack('Month', fill_value=0).rename_axis(None, 1).reset_index(), but this raises a TypeError. Is it possible to perform the above operation using groupby?
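A possible groupby version is sketched below, with two assumptions. First, the columns are selected with a list (double brackets) instead of a bare 'Sales','Revenue' pair. Second, the TypeError likely comes from rename_axis(None, 1): after unstack the columns form a two-level MultiIndex, so a single scalar name cannot be assigned to them, and the sketch passes one name per level instead. The variable `result` is just an illustrative name.

```python
import pandas as pd

# Sample data copied from the second table in the question.
df = pd.DataFrame({
    "Month": [1, 1, 3, 1, 12],
    "Fruit": ["Apple", "Bananas", "Apple", "Kiwi", "Melon"],
    "Sales": [45, 12, 6, 34, 12],
    "Revenue": [45, 12, 6, 34, 12],
})

result = (
    # Select both value columns with a list, hence the double brackets.
    df.groupby(["Fruit", "Month"])[["Sales", "Revenue"]]
      .sum()
      # Month moves into the columns, giving (value, month) column pairs.
      .unstack("Month", fill_value=0)
      # The columns are a two-level MultiIndex, so clear both level names.
      .rename_axis(columns=[None, None])
      .reset_index()
)

print(result)
```

Because the columns stay a MultiIndex, a single column is addressed with a tuple such as result[("Sales", 1)].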