Pandas shift data column by date

I have a panel dataset that is indexed by Date and ID and looks something like this:

import pandas as pd

df = pd.DataFrame({'Date': ['2005-12-31', '2006-03-31', '2006-09-30',
                            '2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
                   'ID': [1, 1, 1, 2, 2, 2, 2],
                   'Value': [14, 25, 34, 23, 67, 14, 46]})

I'm trying to shift the values within the same ID by date, say by one consecutive quarter. groupby.shift is not giving me the right thing, or maybe I'm missing something. Here is what I did:

df['pre_value'] = df.groupby('ID')['Value'].shift(1)

This shifts the values within the same ID, but it ignores the date. Note that ID==1 has no row for 2006-06-30, so the pre_value for its 2006-09-30 row should really be NaN. I also considered using a MultiIndex or declaring the dataset as a panel, but that complicates my other calculations. Is there an easy way to do this with a DataFrame?
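To make the mismatch concrete, here is a minimal sketch that rebuilds the sample frame and inspects the row in question. The variable name `row` is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2005-12-31', '2006-03-31', '2006-09-30',
             '2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
    'ID': [1, 1, 1, 2, 2, 2, 2],
    'Value': [14, 25, 34, 23, 67, 14, 46],
})

# groupby.shift is positional: it takes the previous *row* within each ID,
# regardless of how far apart the dates are.
df['pre_value'] = df.groupby('ID')['Value'].shift(1)

# ID 1 has no 2006-06-30 row, yet its 2006-09-30 row still receives
# the 2006-03-31 value instead of NaN.
row = df[(df['ID'] == 1) & (df['Date'] == '2006-09-30')]
print(row)
```

The printed pre_value is the one from two quarters earlier, which is the behaviour I want to avoid.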

1 answer

I would just make a copy of the DataFrame, shift Date by 1 (it seems you want a one-quarter shift), and then merge it back onto the original frame. To shift the date, you can convert the string dates to pandas periods, which makes the shift easier.

In [34]: df['Date'] = pd.PeriodIndex(df['Date'], freq='Q')

In [35]: df2 = df.copy()

In [36]: df2['Date'] += 1

In [37]: df.merge(df2, on=['Date','ID'], suffixes=('', '_lag1'), how='left')
Out[37]:
    Date  ID  Value  Value_lag1
0 2005Q4   1     14         NaN
1 2006Q1   1     25          14
2 2006Q3   1     34         NaN
3 2005Q4   2     23         NaN
4 2006Q1   2     67          23
5 2006Q2   2     14          67
6 2006Q3   2     46          14
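For convenience, the same steps can be sketched as one standalone script. It assumes the sample data from the question; the names `lagged` and `result` are illustrative and not part of the session above.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2005-12-31', '2006-03-31', '2006-09-30',
             '2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
    'ID': [1, 1, 1, 2, 2, 2, 2],
    'Value': [14, 25, 34, 23, 67, 14, 46],
})

# Convert the date strings to quarterly periods so that "+ 1" means "next quarter".
df['Date'] = pd.PeriodIndex(df['Date'], freq='Q')

# Copy the frame and move every observation one quarter forward.
lagged = df.copy()
lagged['Date'] = lagged['Date'] + 1

# Left-merge on (Date, ID): a row only finds a lagged value if the same ID
# was actually observed in the previous quarter; otherwise it stays NaN.
result = df.merge(lagged, on=['Date', 'ID'], how='left', suffixes=('', '_lag1'))
print(result)
```

Because the match is made on the calendar quarter rather than on row position, the 2006Q3 row for ID 1 gets NaN, since that ID has no 2006Q2 observation.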

Source: https://habr.com/ru/post/1623601/

