I have a pandas data frame defined as follows:
    import datetime
    import pandas

    # 28 consecutive days starting 2001-05-04
    last_4_weeks_range = pandas.date_range(
        start=datetime.datetime(2001, 5, 4), periods=28)
    # One row per day for each of two REST_KEY values
    last_4_weeks = pandas.DataFrame(
        [{'REST_KEY': 1, 'DLY_TRN_QT': 80, 'DLY_SLS_AMT': 90,
          'COOP_DLY_TRN_QT': 30, 'COOP_DLY_SLS_AMT': 20}] * 28 +
        [{'REST_KEY': 2, 'DLY_TRN_QT': 70, 'DLY_SLS_AMT': 10,
          'COOP_DLY_TRN_QT': 50, 'COOP_DLY_SLS_AMT': 20}] * 28,
        index=last_4_weeks_range.append(last_4_weeks_range))
    last_4_weeks.sort(inplace=True)  # sort by index (pandas 0.11 API)
and when I resample it:
    In [265]: last_4_weeks.resample('7D', how='sum')
    Out[265]:
                COOP_DLY_SLS_AMT  COOP_DLY_TRN_QT  DLY_SLS_AMT  DLY_TRN_QT  REST_KEY
    2001-05-04               280              560          700        1050        21
    2001-05-11               280              560          700        1050        21
    2001-05-18               280              560          700        1050        21
    2001-05-25               280              560          700        1050        21
    2001-06-01                 0                0            0           0         0
As a result, I get an extra empty bin that I did not expect to see: 2001-06-01. I would not expect this bin to be there, since my 28 days divide evenly into the 7-day resample I am doing. I have tried playing with the closed kwarg, but I can't get rid of the extra bin. Why does this extra bin appear when I have nothing to put in it, and how can I avoid creating it?
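To make the expectation concrete, here is a minimal sketch in plain Python (deliberately not pandas, so it does not depend on any version's resample behavior) that assigns each of the 28 days to a 7-day bin anchored at the first date. The names `days` and `bin_starts` are just illustrative.

```python
import datetime

start = datetime.date(2001, 5, 4)

# The same 28 consecutive days as in the date_range above.
days = [start + datetime.timedelta(days=i) for i in range(28)]

# Start date of the 7-day bin each day falls into, anchored at `start`.
bin_starts = sorted({
    start + datetime.timedelta(days=7 * ((d - start).days // 7))
    for d in days
})

print(days[-1])    # last day that has data
print(bin_starts)  # distinct bin start dates that actually contain data
```

By this count the last day with data is 2001-05-31 and only four bins contain anything, so a bin starting on 2001-06-01 has no rows to aggregate.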
What I'm ultimately trying to do is get 7-day averages per REST_KEY, so I run:
    In [266]: last_4_weeks.groupby('REST_KEY').resample('7D', how='sum').mean(level=0)
    Out[266]:
              COOP_DLY_SLS_AMT  COOP_DLY_TRN_QT  DLY_SLS_AMT  DLY_TRN_QT  REST_KEY
    REST_KEY
    1                      112              168          504         448       5.6
    2                      112              280           56         392      11.2
But this extra bin throws off my averages. For example, for COOP_DLY_SLS_AMT I get 112, which is (20 * 7 * 4) / 5, rather than the 140 I would get from (20 * 7 * 4) / 4 if the extra bin weren't there. I also did not expect REST_KEY to show up in the aggregation, since it is the groupby key, but that is a minor problem.
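The arithmetic can be checked with a small sketch in plain Python. It uses the daily values for REST_KEY 1 from the data frame definition; the helper `weekly_mean` and the variable names are hypothetical, introduced only for illustration.

```python
# Daily values for REST_KEY 1, taken from the data frame definition.
daily_rest_key_1 = {
    'COOP_DLY_SLS_AMT': 20,
    'COOP_DLY_TRN_QT': 30,
    'DLY_SLS_AMT': 90,
    'DLY_TRN_QT': 80,
}

DAYS_PER_BIN = 7
FULL_BINS = 4  # 28 days / 7 days per bin


def weekly_mean(daily_value, n_bins):
    """Mean of the weekly sums when the 28-day total is spread over n_bins."""
    total = daily_value * DAYS_PER_BIN * FULL_BINS
    return total / n_bins


# Five bins: the four full weeks plus the empty 2001-06-01 bin.
observed = {col: weekly_mean(v, FULL_BINS + 1) for col, v in daily_rest_key_1.items()}
# Four bins: what I expected to get.
expected = {col: weekly_mean(v, FULL_BINS) for col, v in daily_rest_key_1.items()}

print(observed)
print(expected)
```

Dividing by five reproduces the 112, 168, 504 and 448 shown for REST_KEY 1, while dividing by four gives 140, 210, 630 and 560.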
PS I am using pandas 0.11.0