I have a dataframe like this:
Index  STNAME  COUNTY  COUNTY_POP
0      AL      0       100
1      AL      1       150
2      AL      3       200
3      AL      5       50
...
15     CA      0       300
16     CA      1       200
17     CA      3       250
18     CA      4       350
I want to sum the three largest values of COUNTY_POP for each state. So far I have:
In[]: df.groupby(['STNAME'])['COUNTY_POP'].nlargest(3)
Out[]:
Index  STNAME  COUNTY  COUNTY_POP
0      AL      0       100
1      AL      1       150
2      AL      3       200
...
15     CA      0       300
17     CA      3       250
18     CA      4       350
However, when I add .sum() to the code above, I get a single number instead of one total per state:
In[]: df.groupby(['STNAME'])['COUNTY_POP'].nlargest(3).sum()
Out[]: 1350
I am relatively new to Python and Pandas. If anyone could explain what causes this and how to fix it, I would really appreciate it!
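For reference, here is a minimal sketch that reproduces the behavior. It assumes only the AL and CA rows shown above (the elided rows are left out), and the names `top3` and `per_state` are just illustrative. It also shows one possible way to get a total per state, by grouping the result again on its first index level; I am not sure this is the idiomatic approach.

```python
import pandas as pd

# Only the rows visible in the question; the elided rows are omitted (assumption).
df = pd.DataFrame(
    {
        "STNAME": ["AL", "AL", "AL", "AL", "CA", "CA", "CA", "CA"],
        "COUNTY": [0, 1, 3, 5, 0, 1, 3, 4],
        "COUNTY_POP": [100, 150, 200, 50, 300, 200, 250, 350],
    },
    index=[0, 1, 2, 3, 15, 16, 17, 18],
)

# nlargest on the grouped column returns one Series whose index has two
# levels: the state and the original row label.
top3 = df.groupby(["STNAME"])["COUNTY_POP"].nlargest(3)

# Calling .sum() on that Series adds up every value, across all states.
total = top3.sum()

# Grouping again on the first index level (the state) gives one sum per state.
per_state = top3.groupby(level=0).sum()

print(total)
print(per_state)
```

With these eight rows, `total` is the sum over both states combined, while `per_state` holds one value for AL and one for CA.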