How to get a moving average for recent months in Pandas

Question

I have a dataset with two columns: the first is "Date" and the second is the price. Each date is a trading day.

I want to return the table as shown below:

There is one date per month since 2006, and the MA price is the average price over the last N months (N = 1, 2, 3, 4, 5, 6).

For example, with N = 1 on January 1, 2006, the MA should be the average price over December of the previous year. With N = 2, the MA should be the average price over November and December of the previous year.

I have read some solutions about extracting the month from a datetime and using groupby, but I don't know how to put them together.

python pandas datetime

Dylan Aug 22 '17 at 19:56

3 answers

Wen · Answer 1 · 2017-08-22T20:53:56+0000

Or just try

df.sort_index(ascending=False).rolling(5).mean().sort_index(ascending=True)
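
It may help to see which rows this one-liner actually averages. The sketch below uses made-up data (the column name `price` and the values are assumptions, not from the question) and suggests that sorting in descending order before rolling gives a forward-looking window: each row gets the mean of itself and the next 4 rows, rather than the previous 4.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ten consecutive days with prices 0.0 .. 9.0
index = pd.date_range("2017-07-04", periods=10, freq="D")
df = pd.DataFrame({"price": np.arange(10, dtype=float)}, index=index)

# The one-liner from the answer: roll over the reversed frame, then restore order
forward = df.sort_index(ascending=False).rolling(5).mean().sort_index(ascending=True)

# The ordinary trailing window, for comparison
trailing = df.rolling(5).mean()

# Each forward value matches the trailing value found 4 rows later
print(pd.concat([forward, trailing.shift(-4)], axis=1))
```

If the average should only use past prices, as the question describes, the plain `df.rolling(5).mean()` is probably the closer fit.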

For your additional question

 import numpy as np
 import pandas as pd

 index = pd.date_range(start="4th of July 2017", periods=30, freq="D")
 df = pd.DataFrame(np.random.randint(0, 100, 30), index=index)
 df['Month'] = df.index
 df.Month = df.Month.astype(str).str[0:7]  # keep only 'YYYY-MM'
 df.groupby('Month')[0].mean()
 Out[162]:
 Month
 2017-07    47.178571
 2017-08    56.000000
 Name: 0, dtype: float64

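EDIT 3: Rolling average over two months. Dividing the rolling sum by the rolling count weights every day equally, even when the months have different numbers of days.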

 index = pd.date_range(start="4th of July 2017", periods=300, freq="D")
 df = pd.DataFrame(np.random.randint(0, 100, 300), index=index)
 df['Month'] = df.index
 df.Month = df.Month.astype(str).str[0:7]
 # a list of functions gives one column per aggregate; the dict-renaming form
 # agg({'sum': 'sum', 'count': 'count'}) was removed in pandas 1.0
 df = df.groupby('Month')[0].agg(['sum', 'count'])
 df['sum'].rolling(2).sum() / df['count'].rolling(2).sum()
 Out[200]:
 Month
 2017-07          NaN
 2017-08    43.932203
 2017-09    45.295082
 2017-10    46.967213
 2017-11    46.327869
 2017-12    49.081967
 #etc
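
To get closer to what the question asks (N = 1 to 6, with each month showing the average of the N months before it), the same sum/count idea can be put in a loop and shifted by one month. This is only a sketch under assumptions: the function name `trailing_month_averages`, the `MA_n` column names and the demo data are made up for illustration.

```python
import numpy as np
import pandas as pd

def trailing_month_averages(prices, max_n=6):
    """prices: Series of daily prices with a DatetimeIndex (assumed layout)."""
    # One row per calendar month: total of the prices and number of observations
    by_month = prices.groupby(prices.index.to_period("M")).agg(["sum", "count"])
    out = pd.DataFrame(index=by_month.index)
    for n in range(1, max_n + 1):
        total = by_month["sum"].rolling(n).sum()
        count = by_month["count"].rolling(n).sum()
        # shift(1) so the row for a month describes the n months before it
        out[f"MA_{n}"] = (total / count).shift(1)
    return out

# Deterministic demo data: prices 0.0, 1.0, 2.0, ... for January to April 2017
index = pd.date_range("2017-01-01", "2017-04-30", freq="D")
prices = pd.Series(np.arange(len(index), dtype=float), index=index)

result = trailing_month_averages(prices, max_n=2)
print(result)
```

Here the row for 2017-02 holds the January average in `MA_1`, and the row for 2017-03 holds the January plus February average in `MA_2`. Months without enough history come out as NaN.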

2Obe · Answer 2 · 2017-08-22T20:46:40+0000

This returns the rolling mean over the number of periods given by the window parameter. For instance, window=1 returns the original values, window=2 averages over 2 days, and so on.

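 import numpy as np
 import pandas as pd

 index = pd.date_range(start="4th of July 2017", periods=30, freq="D")
 df = pd.DataFrame(np.random.randint(0, 100, 30), index=index)
 # pd.rolling_mean was removed from pandas; DataFrame.rolling(...).mean() replaces it
 print([df.rolling(window=i).mean() for i in range(1, 5)])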

.....

 2017-07-04     NaN
 2017-07-05    20.5
 2017-07-06    64.5
 2017-07-07    58.5
 2017-07-08    13.0
 2017-07-09     4.5
 2017-07-10    17.5
 2017-07-11    23.5
 2017-07-12    40.5
 2017-07-13    60.0
 2017-07-14    73.0
 2017-07-15    90.0
 2017-07-16    56.5
 2017-07-17    55.0
 2017-07-18    57.0
 2017-07-19    45.0
 2017-07-20    77.0
 2017-07-21    46.5
 2017-07-22     3.5
 2017-07-23    48.5
 2017-07-24    71.5
 2017-07-25    52.0
 2017-07-26    56.5
 2017-07-27    47.5
 2017-07-28    64.0
 2017-07-29    82.0
 2017-07-30    68.0
 2017-07-31    72.5
 2017-08-01    58.5
 2017-08-02    67.0

.....

Next, you can drop the NaN values with the DataFrame's dropna method, for example:

 df.rolling(window=2).mean().dropna()  # adjust the window size as needed

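So the complete code to print moving averages over months first reduces the daily values to one mean per calendar month, then rolls over those monthly means:

 monthly = df.groupby(df.index.to_period("M")).mean()
 print([monthly.rolling(i).mean().dropna() for i in range(1, len(monthly) + 1)])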

Yanfei W. · Answer 3 · 2017-08-22T21:09:47+0000

First set Date as an index:

price_df.set_index('Date', inplace=True)
price_df.index = pd.to_datetime(price_df.index)

Then calculate the moving average for the past N months (here N = i, approximating one month as 30 rows):
i = 1  # N, the number of months
mv = price_df.rolling(window=i * 30, center=False).mean().dropna()

Finally, return only the rows for the first day of each month (if this is what you want to return). Note that .ix was removed from pandas, so use .loc:
mv.loc[mv.index.day == 1]
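
One caveat with trading-day data: the 1st of a month is not always a trading day, and 30 rows span more than 30 calendar days. A possible variant, sketched below with made-up business-day data (the `Price` column, the date range and `n` are assumptions), uses a time-based window and picks the first available row of each month instead. It swaps the fixed row count for a calendar-day offset, so it is not the same calculation as above.

```python
import numpy as np
import pandas as pd

# Hypothetical trading calendar: weekdays only, holidays ignored
index = pd.bdate_range("2017-01-02", "2017-06-30")
price_df = pd.DataFrame({"Price": np.arange(len(index), dtype=float)}, index=index)

n = 2  # number of months, still approximated as 30 calendar days each

# Offset window: every row within the last n*30 calendar days, current row included
mv = price_df.rolling(window=f"{n * 30}D").mean()

# First trading day of each month, whether or not it falls on the 1st
first_days = mv.groupby(mv.index.to_period("M")).head(1)
print(first_days)
```

With this sample calendar, January and April do not start on a weekday, so filtering on `mv.index.day == 1` would skip them, while `first_days` keeps one row per month. An offset window also returns a value as soon as one observation is available, so the earliest rows average over less than the full period.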
