Does groupby keep the order among groups? In which way?

Question


While answering the question "Sort a series of panda data by month name?", we ran into some strange groupby behavior.

    import pandas as pd

    df = pd.DataFrame([["dec", 12], ["jan", 40], ["mar", 11], ["aug", 21],
                       ["aug", 11], ["jan", 11], ["jan", 1]],
                      columns=["Month", "Price"])
    df["Month_dig"] = pd.to_datetime(df.Month, format='%b', errors='coerce').dt.month
    df.sort_values(by="Month_dig", inplace=True)
    # Now df looks like
    #   Month  Price  Month_dig
    # 1   jan     40          1
    # 5   jan     11          1
    # 6   jan      1          1
    # 2   mar     11          3
    # 3   aug     21          8
    # 4   aug     11          8
    # 0   dec     12         12

    total = (df.groupby(df['Month'])['Price'].mean())
    print(total)
    # output
    # Month
    # aug    16.000000
    # dec    12.000000
    # jan    17.333333
    # mar    11.000000
    # Name: Price, dtype: float64

It appears that the data in total is sorted alphabetically, whereas the OP and I were expecting

    Month
    jan    17.333333
    mar    11.000000
    aug    16.000000
    dec    12.000000
    Name: Price, dtype: float64

What mechanism is behind groupby? I know from the documentation that it preserves the order of rows within each group, but is there a rule for the order among groups? It seems to me that a fairly natural group order would be ["jan", "mar", "aug", "dec"], since the data in df is sorted this way.

P.S. From ["aug", "dec", "jan", "mar"], it seems that the group names are sorted in alphabetical order.
I am using Python 3.6 and pandas 0.20.3.


python python-3.x pandas

Tai Dec 31 '17 at 17:34


1 answer

Patrick haugh · Accepted Answer · 2017-12-31T17:42:48+0000

pandas.DataFrame.groupby has a sort argument, which defaults to True, so the group keys are sorted in the result. Try

 total = (df.groupby(df['Month'], sort=False)['Price'].mean())
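For comparison, here is a minimal sketch that runs both variants on the data from the question. It assumes a reasonably recent pandas; the variable names sorted_keys and first_seen are only illustrative. With sort=False, the groups come out in the order in which each key first appears in the (already sorted) frame.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame(
    [["dec", 12], ["jan", 40], ["mar", 11], ["aug", 21],
     ["aug", 11], ["jan", 11], ["jan", 1]],
    columns=["Month", "Price"],
)
df["Month_dig"] = pd.to_datetime(df["Month"], format="%b", errors="coerce").dt.month
df = df.sort_values(by="Month_dig")

# Default (sort=True): group keys are sorted, here alphabetically.
sorted_keys = df.groupby("Month")["Price"].mean()

# sort=False: group keys keep the order of first appearance in df.
first_seen = df.groupby("Month", sort=False)["Price"].mean()

print(list(sorted_keys.index))  # ['aug', 'dec', 'jan', 'mar']
print(list(first_seen.index))   # ['jan', 'mar', 'aug', 'dec']
```

Note that sort=False only helps here because df was sorted by Month_dig beforehand; on an unsorted frame the groups would simply follow the row order.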
