I suspect that many people working with time series data have already run into this problem, and pandas does not seem to provide a simple solution (yet!).
Let's assume that:
- You have a time series of daily data with close prices, indexed by date (day).
- Today is June 19th. The last close value is for 18JUN.
- You want to resample the daily data into OHLC bars at a given frequency (say M or 2M), ending on 18JUN.
So, for M frequency, the last bar would cover 19MAY-18JUN, the previous one 19APR-18MAY, and so on.
ts.resample('M', how='ohlc')
will resample, but "M" means "month end", so the result has a full month for 2014-05 and only a roughly 2-week period for 2014-06, so the last bar is not a true monthly bar. This is not what we want!
With frequency 2M, given my sample data, my test gives a last bar labeled 2014-07-31 (and the previous one labeled 2014-05-31), which is pretty misleading since there is no data for July. The supposed last 2-month bar again covers only the last 2 weeks.
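To make the monthly case concrete, here is a minimal sketch on made-up data. The series `ts`, its values and its 120-day length are assumptions for illustration only, and `pd.offsets.MonthEnd()` is used in place of the "M" alias because the alias spelling differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 120 daily close prices ending on 2014-06-18.
idx = pd.date_range(end="2014-06-18", periods=120, freq="D")
ts = pd.Series(np.arange(120, dtype=float), index=idx, name="close")

# Month-end resampling into OHLC bars (MonthEnd() is the offset behind "M").
bars = ts.resample(pd.offsets.MonthEnd()).ohlc()

# The last bar is labeled with the month end, yet it only holds 1-18 June.
print(bars.tail(2))
print(ts[ts.index > bars.index[-2]].index[[0, -1]])
```

On this data the last row is labeled 2014-06-30 although the underlying observations stop at 2014-06-18, which is the partial bar described above.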
The correct DatetimeIndex is easily created with:
pandas.date_range(end='2014-06-18', freq='2M', periods=300) + datetime.timedelta(days=18)
(The pandas documentation prefers doing the same thing through
pandas.date_range(end='2014-06-18', freq='2M', periods=300) + pandas.tseries.offsets.DateOffset(days=18)
but my tests show that this method, although more "pandaic", is twice as slow!)
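As a quick check of what the shifted range contains, the following sketch builds a short version of it. The choice of `periods=4` is only to keep the output small, and `pd.offsets.MonthEnd(2)` stands in for the "2M" alias.

```python
import datetime
import pandas as pd

# Two-monthly month ends up to (and not beyond) 2014-06-18.
month_ends = pd.date_range(end="2014-06-18", freq=pd.offsets.MonthEnd(2), periods=4)

# Shift every month end forward by 18 days to land on the 18th.
edges = month_ends + datetime.timedelta(days=18)
print(list(edges.strftime("%Y-%m-%d")))
```

The shifted dates should all fall on the 18th, with the last one on 2014-06-18. Note that a fixed day shift only works this way for day numbers up to 28: for example, 31 January plus 30 days is 2 March in a non-leap year, not 30 February.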
In any case, we cannot pass this correct DatetimeIndex to ts.resample().
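For illustration, one stopgap that sidesteps resample altogether is to bin the dates manually against anchored edges and group on the result. This is only a sketch under assumptions, not a confirmed or recommended pandas approach: the sample series `ts`, the number of bars, and the use of `np.searchsorted` plus `groupby(...).ohlc()` are all choices made here for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 120 daily close prices ending on 2014-06-18.
idx = pd.date_range(end="2014-06-18", periods=120, freq="D")
ts = pd.Series(np.arange(120, dtype=float), index=idx, name="close")

# Right-hand bin edges anchored on the last date: 18FEB, 18MAR, ..., 18JUN.
last = ts.index[-1]
edges = pd.DatetimeIndex(
    [last - pd.DateOffset(months=k) for k in range(4, 0, -1)] + [last]
)

# For each date, find the first edge that is >= the date, so every bar
# covers the interval (previous edge, edge] and is labeled by its right edge.
pos = np.searchsorted(edges.values, ts.index.values, side="left")
labels = edges[pos]

# Group the daily values by bar label and compute open/high/low/close.
bars = ts.groupby(labels).ohlc()
print(bars)
```

With this sample data the last bar covers 19MAY-18JUN and the one before it 19APR-18MAY. Dates earlier than the first edge would all be lumped into a first, possibly partial, bar, so the number of edges has to match the length of the data.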
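Are the pandas developers aware of this, and in the meantime, is there a proper way to get OHLC bars anchored on the last date of the series?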