pandas .resample() method: custom labels?

I am a bit stuck with the .resample() method. I am working with a DataFrame whose index consists of datetime objects in the format YYYY-MM-DD, and whose columns hold property values for several cities, as shown below:

State       California  Illinois    Pennsylvania    Arizona
RegionName  Los Angeles Chicago     Philadelphia    Phoenix
1/1/2000    204400      136800      52700           111000
2/1/2000    207000      138300      53100           111700
3/1/2000    209800      140100      53200           112800
4/1/2000    212300      141900      53400           113700
5/1/2000    214500      143700      53700           114300
6/1/2000    216600      145300      53800           115100
7/1/2000    219000      146700      53800           115600
8/1/2000    221100      147900      54100           115900
9/1/2000    222800      149000      54500           116500

When I apply the .resample() method to convert it to a quarterly view, I get the following structure:

hd = hd.resample('Q').mean()


State       New York    California  Illinois    Pennsylvania    Arizona
RegionName  New York    Los Angeles Chicago     Philadelphia    Phoenix
3/31/2000   NaN         207066.6667 138400      53000           111833.3333
6/30/2000   NaN         214466.6667 143633.3333 53633.33333     114366.6667
9/30/2000   NaN         220966.6667 147866.6667 54133.33333     116000

However, I need labels for the newly created index in a format like 2000q1, rather than the last (or first) day of the quarter. I have read the .resample() page in the pandas documentation, but for the life of me I can't figure out how to apply a custom label. Can anybody help me?

Regards, Greem


Use to_period and strftime:

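# hd.index = pd.to_datetime(hd.index)  # only needed if the index is not a DatetimeIndex yet
hd = hd.resample('Q').mean()
# quarter-end timestamps -> quarterly periods -> strings such as '2000q1'
hd.index = hd.index.to_period('Q').strftime('%Yq%q')
print(hd)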
State       California Illinois Pennsylvania Arizona
RegionName Los Angeles  Chicago Philadelphia Phoenix
2000q1          207066   138400        53000  111833
2000q2          214466   143633        53633  114366
2000q3          220966   147866        54133  116000
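
For anyone who wants to try this end to end, here is a minimal sketch under a few assumptions: it rebuilds only two of the cities from the question, uses flat column names instead of the two-level header, and the names idx and quarterly are just illustrative. It passes a QuarterEnd offset object to resample rather than the 'Q' string, because newer pandas releases deprecate that alias in favour of 'QE'; the bins should be the same either way.

```python
import pandas as pd

# Monthly sample data: two cities from the question, flat column names.
idx = pd.date_range("2000-01-01", periods=9, freq="MS")
hd = pd.DataFrame(
    {
        "Los Angeles": [204400, 207000, 209800, 212300, 214500,
                        216600, 219000, 221100, 222800],
        "Chicago": [136800, 138300, 140100, 141900, 143700,
                    145300, 146700, 147900, 149000],
    },
    index=idx,
)

# Quarterly means, labelled with quarter-end timestamps.
quarterly = hd.resample(pd.offsets.QuarterEnd()).mean()

# Quarter-end timestamps -> quarterly periods -> strings like '2000q1'.
quarterly.index = quarterly.index.to_period("Q").strftime("%Yq%q")
print(quarterly)
```

Note that after strftime the index holds plain strings, so it no longer behaves like a date index.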

Convert the index to periods with to_period, then use groupby:

df.index = pd.to_datetime(df.index)
df.set_index(df.index.to_period('Q')).groupby(level=0).mean()

State   California Illinois Pennsylvania Arizona
Region Los Angeles  Chicago Philadelphia Phoenix
2000Q1      207066   138400        53000  111833
2000Q2      214466   143633        53633  114366
2000Q3      220966   147866        54133  116000
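
A small sketch of the same idea on a single column from the question (Phoenix), assuming a monthly DatetimeIndex; the name by_quarter is illustrative. Grouping directly on the converted index avoids the set_index step, and the result should keep a period-based index rather than strings, so it can still be formatted or converted back to timestamps later.

```python
import pandas as pd

# One column from the question, indexed by month start.
df = pd.DataFrame(
    {"Phoenix": [111000, 111700, 112800, 113700, 114300,
                 115100, 115600, 115900, 116500]},
    index=pd.date_range("2000-01-01", periods=9, freq="MS"),
)

# Group rows by the quarter each timestamp falls into and average them.
by_quarter = df.groupby(df.index.to_period("Q")).mean()

# Each label prints in the default period style, e.g. 2000Q1.
labels = [str(p) for p in by_quarter.index]
print(labels)
print(by_quarter)
```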

Or with strftime, as suggested by @jezrael:

# passing 'Q' explicitly avoids relying on the index frequency being inferable
df.groupby(pd.to_datetime(df.index).to_period('Q').strftime('%Yq%q')).mean()

        California Illinois Pennsylvania Arizona
       Los Angeles  Chicago Philadelphia Phoenix
2000q1      207066   138400        53000  111833
2000q2      214466   143633        53633  114366
2000q3      220966   147866        54133  116000
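
The label style comes entirely from the format string, so other layouts are possible too. The sketch below is only an illustration on a hand-built quarterly range (the variable names are made up); %q is the period-specific directive for the quarter number and %Y is the four-digit year.

```python
import pandas as pd

# Three consecutive quarters starting at the first quarter of 2000.
quarters = pd.period_range("2000Q1", periods=3, freq="Q")

# Same periods, two different label styles.
lower_style = list(quarters.strftime("%Yq%q"))
spaced_style = list(quarters.strftime("Q%q %Y"))

print(lower_style)
print(spaced_style)
```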

Source: https://habr.com/ru/post/1676536/

