Does groupby automatically group by all non-numeric columns in pandas?

I have a sample dataset below (only the first few rows are shown; there are 193 rows in total):

country,beer_servings,spirit_servings,wine_servings,total_litres_of_pure_alcohol,continent
Afghanistan,0,0,0,0.0,Asia
Albania,89,132,54,4.9,Europe
Algeria,25,0,14,0.7,Africa
Andorra,245,138,312,12.4,Europe
Angola,217,57,45,5.9,Africa
Antigua & Barbuda,102,128,45,4.9,North America
...

When I ran this: drinks.groupby('continent').head()

I get back a DataFrame with 30 rows. But in these 30 rows I still have repeated continent names. For example, in the image below you can see that Europe is repeated twice (in rows 1 and 3):

[image: output of drinks.groupby('continent').head()]

I don't understand why I still have two rows with the same continent when I grouped by continent.

Is groupby also grouping by country? How can I get a groupby like in SQL, with aggregates such as max, min, sum and so on?

+4

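No!

In this case, head is called on the groupby object, so it behaves like pd.DataFrame.head applied to each group.

groupby followed by head returns the first rows of every group, in their original order.

So, for example, to get 1 row per group, pass 1 to head: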

df.groupby('continent').head(1)

[image: output of df.groupby('continent').head(1)]
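To see the per-group behaviour on the six rows shown in the question, here is a minimal sketch. The DataFrame is typed in by hand as a stand-in for the full 193-row file, so the row counts differ from the 30 rows mentioned above.

```python
import pandas as pd

# Stand-in for the real data: only the six rows listed in the question.
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra",
                "Angola", "Antigua & Barbuda"],
    "beer_servings": [0, 89, 25, 245, 217, 102],
    "spirit_servings": [0, 132, 0, 138, 57, 128],
    "wine_servings": [0, 54, 14, 312, 45, 45],
    "total_litres_of_pure_alcohol": [0.0, 4.9, 0.7, 12.4, 5.9, 4.9],
    "continent": ["Asia", "Europe", "Africa", "Europe",
                  "Africa", "North America"],
})

# head() keeps up to 5 rows from each group, so a continent with
# several countries still shows up several times.
first_five = drinks.groupby("continent").head()

# head(1) keeps only the first row of each group, in original row order.
first_one = drinks.groupby("continent").head(1)

print(first_five["continent"].value_counts())
print(first_one[["country", "continent"]])
```

With this small sample no continent has more than five rows, so head() returns all six rows, while head(1) returns one row per continent.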

+3

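drinks.groupby('continent').head([n=5]) returns the first n rows of each group, and n defaults to 5. With drinks.groupby('continent').head(1), you get only the first row for each continent.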
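For the SQL-style part of the question (max, min, sum per continent), head is probably not what you want; an aggregation is. A minimal sketch, assuming the column names from the question and a hand-typed subset of the data (the name summary is just illustrative):

```python
import pandas as pd

# Hand-typed subset of the question's data; the real file has 193 rows.
drinks = pd.DataFrame({
    "country": ["Afghanistan", "Albania", "Algeria", "Andorra",
                "Angola", "Antigua & Barbuda"],
    "beer_servings": [0, 89, 25, 245, 217, 102],
    "continent": ["Asia", "Europe", "Africa", "Europe",
                  "Africa", "North America"],
})

# Roughly: SELECT continent, MIN(beer_servings), MAX(beer_servings),
#          SUM(beer_servings) FROM drinks GROUP BY continent
summary = drinks.groupby("continent")["beer_servings"].agg(["min", "max", "sum"])

print(summary)
```

Here the result has exactly one row per continent, because agg collapses each group to a single row instead of returning rows from the original frame.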

+2

Source: https://habr.com/ru/post/1673543/

