Pandas: sort by month in the index

    Dec    47
    Nov    36
    Oct    14
    Sep     2
    Jan     2
    Aug     2
    May     1
    Apr     1
    Jun     1
    Jul     1
    Feb     1
    Name: date, dtype: int64

I am trying to sort the series above, whose index holds month names. However, instead of sorting in calendar order, the sort function orders the months alphabetically by name. How can I sort the data correctly? I think I need to indicate that the index holds months rather than plain strings. Any help is appreciated. The following is a snippet of my code.

    import calendar

    movies = release_dates[release_dates.title.str.contains('Christmas') & (release_dates.country == 'USA')]
    movies = movies.date.dt.month.apply(lambda x: calendar.month_abbr[x])
    counts = movies.value_counts()
    counts
3 answers

You can convert the index to a CategoricalIndex whose categories list the months in calendar order, then call sort_index:

    df.index = pd.CategoricalIndex(df.index,
                                   categories=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                                               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
                                   sorted=True)
    df = df.sort_index()
    print (df)

         date
    Jan     2
    Feb     1
    Apr     1
    May     1
    Jun     1
    Jul     1
    Aug     2
    Sep     2
    Oct    14
    Nov    36
    Dec    47

Okay, that was not very difficult. I am sure Categorical would have worked; I just could not solve the problem using it. What I did:

  • Sort by month while the months are still represented as integers.
  • On the resulting series, map the index to convert each integer month to its abbreviated name.

I am sure there are more efficient ways to solve this, so if you have a better one, please post it.

    import calendar

    # Keep the months as integers so that they sort in calendar order
    months = release_dates[release_dates.title.str.contains('Christmas') & (release_dates.country == 'USA')].date.dt.month
    counts = months.value_counts()
    counts = counts.sort_index()
    # Relabel only after sorting; a list avoids handing pandas a lazy map object
    counts.index = [calendar.month_abbr[m] for m in counts.index]
    counts.plot.bar()
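For anyone who wants to try the two steps above without the release_dates data, here is a minimal self-contained sketch. The dates series is made up for illustration and only stands in for release_dates.date; it also assumes the default English locale for calendar.month_abbr.

```python
import calendar

import pandas as pd

# Hypothetical sample dates standing in for release_dates.date
dates = pd.Series(pd.to_datetime([
    "2015-12-25", "2014-11-01", "2016-12-01", "2013-01-15", "2012-10-31",
]))

# Step 1: count and sort while the months are still integers (1..12)
counts = dates.dt.month.value_counts().sort_index()

# Step 2: only now replace the integer labels with abbreviated names
counts.index = [calendar.month_abbr[m] for m in counts.index]

print(counts)
```

The point of the ordering is that sort_index sees integers, so no categorical type is needed; the names are purely cosmetic labels applied afterwards.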

Adding to @jezrael's very useful answer:

As of pandas 0.25.1, pandas.CategoricalIndex no longer accepts the sorted keyword; use ordered instead.

Old way:

    df.index = pd.CategoricalIndex(df.index,
                                   categories=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                                               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
                                   sorted=True)
    df = df.sort_index()

Error:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-468-3f0ab66734d4> in <module>
          2 net.index = pd.CategoricalIndex(net.index,
          3                                 categories=['Jan', 'Feb', 'Mar', 'Apr','May','Jun', 'Jul', 'Aug','Sep', 'Oct', 'Nov', 'Dec'],
    ----> 4                                 sorted=True)
          5 net = net.sort_index()
          6 net

    TypeError: __new__() got an unexpected keyword argument 'sorted'

New way:

    df.index = pd.CategoricalIndex(df.index,
                                   categories=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                                               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
                                   ordered=True)
    df = df.sort_index()
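As a rough end-to-end check of the ordered=True version, here is a small sketch that rebuilds the series from the question by hand and sorts it. Building the category list from calendar.month_abbr is my own shortcut rather than something from the answers, and it assumes the default English locale.

```python
import calendar

import pandas as pd

# Counts copied from the question; the index holds month abbreviations as strings
counts = pd.Series(
    [47, 36, 14, 2, 2, 2, 1, 1, 1, 1, 1],
    index=["Dec", "Nov", "Oct", "Sep", "Jan", "Aug", "May", "Apr", "Jun", "Jul", "Feb"],
    name="date",
)

# calendar.month_abbr[0] is an empty string, so skip it to get Jan..Dec
month_order = list(calendar.month_abbr)[1:]

# The categories define the sort order; months missing from the data (Mar) are fine
counts.index = pd.CategoricalIndex(counts.index, categories=month_order, ordered=True)
counts = counts.sort_index()

print(counts)
```

Months that never occur in the data, such as Mar here, stay in the category list but do not appear as rows in the sorted result.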

Source: https://habr.com/ru/post/1274415/

