np.sort gives unexpected results when sorting a pandas DataFrame sum

When I run data[genres].sum(), I get the following result:

    Action        1891
    Adult            9
    Adventure     1313
    Animation      314
    Biography      394
    Comedy        3922
    Crime         1867
    Drama         5697
    Family         754
    Fantasy        916
    Film-Noir       40
    History        358
    Horror        1215
    Music          371
    Musical        260
    Mystery       1009
    News             1
    Reality-TV       1
    Romance       2441
    Sci-Fi         897
    Sport          288
    Thriller      2832
    War            512
    Western        235
    dtype: int64

But when I try to sort the counts using np.sort:

    genre_count = np.sort(data[genres].sum())[::-1]
    pd.DataFrame({'Genre Count': genre_count})

I get the following result:

    Out[19]:
        Genre Count
    0          5697
    1          3922
    2          2832
    3          2441
    4          1891
    5          1867
    6          1313
    7          1215
    8          1009
    9           916
    10          897
    11          754
    12          512
    13          394
    14          371
    15          358
    16          314
    17          288
    18          260
    19          235
    20           40
    21            9
    22            1
    23            1

The expected result should look like this:

                Genre Count
    Drama              5697
    Comedy             3922
    Thriller           2832
    Romance            2441
    Action             1891
    Crime              1867
    Adventure          1313
    Horror             1215
    Mystery            1009
    Fantasy             916
    Sci-Fi              897
    Family              754
    War                 512
    Biography           394
    Music               371
    History             358
    Animation           314
    Sport               288
    Musical             260
    Western             235
    Film-Noir            40
    Adult                 9
    News                  1
    Reality-TV            1

NumPy seems to ignore the genre column.

Can someone help me figure out what I'm doing wrong?

2 answers

data[genres].sum() returns a Series. The genre "column" is not actually a column: it is the index of that Series.

np.sort just looks at the DataFrame or Series values, not at the index, and returns a new NumPy array with the sorted data[genres].sum() values. Index information is lost.
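A minimal sketch of that behaviour, assuming a small hand-built Series (the name `totals` and the four-genre subset are stand-ins for data[genres].sum(), not the asker's real data). It shows that np.sort hands back a bare NumPy array, and that keeping the labels in NumPy terms would require reordering the Series by position with np.argsort:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data[genres].sum(): genre names live in the index.
totals = pd.Series({"Action": 1891, "Adult": 9, "Comedy": 3922, "Drama": 5697})

# np.sort works on the underlying values only and returns a plain ndarray,
# so the genre labels are gone after this call.
sorted_values = np.sort(totals)[::-1]
print(type(sorted_values).__name__)
print(sorted_values)

# To keep the labels with NumPy, sort positions instead of values
# and use them to reorder the Series itself.
order = np.argsort(totals.values)[::-1]
labelled = totals.iloc[order]
print(labelled)
```

The second half is only there to illustrate where the labels went; the pandas sorting methods described below do the same thing more directly.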

The way to sort data[genres].sum() and preserve the index information is to do something like:

    genre_count = data[genres].sum()
    genre_count.sort(ascending=False)  # in-place sort of genre_count, high to low

Then you can turn the sorted genre_count Series back into a DataFrame if you want:

 pd.DataFrame({'Genre Count': genre_count}) 

data[genres].sum() returns a Series.

And if you use pandas 0.20 or later, the command changes slightly: use sort_values, which returns a new sorted Series instead of sorting in place.

    genre_count = data[genres].sum()
    genre_count = genre_count.sort_values(ascending=False)  # keep the returned, sorted Series

See the pandas documentation for Series.sort_values for details.
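As a rough end-to-end sketch of this approach, assuming a small hand-built Series in place of data[genres].sum() (the name `totals` and the four sample genres are illustrative only), the sorted Series can be converted straight into a one-column DataFrame that keeps the genre names as its index:

```python
import pandas as pd

# Hypothetical stand-in for data[genres].sum().
totals = pd.Series({"Action": 1891, "Adult": 9, "Comedy": 3922, "Drama": 5697})

# sort_values returns a new Series; the original is left untouched,
# so the result has to be assigned to a name.
genre_count = totals.sort_values(ascending=False)

# to_frame turns the Series into a DataFrame with the given column name,
# keeping the genre labels as the row index.
result = genre_count.to_frame("Genre Count")
print(result)
```

Using to_frame here is a design choice for the sketch; pd.DataFrame({'Genre Count': genre_count}) from the first answer should give an equivalent table.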


Source: https://habr.com/ru/post/983819/
