I want to resample a datetime-indexed DataFrame using a start date, an end date and a granularity.
Let's say I have this DataFrame:
                    value
00:00, 01/05/2017       2
12:00, 01/05/2017       4
00:00, 02/05/2017       6
12:00, 02/05/2017       8
00:00, 03/05/2017      10
12:00, 03/05/2017      12
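For anyone who wants to reproduce this, here is a minimal sketch that builds the frame above. It assumes the dates are day-first (1 to 3 May 2017), which the 12-hour spacing suggests; the variable names are only illustrative.

```python
import pandas as pd

# Timestamps exactly as written in the table, parsed as day-first dates (assumption).
stamps = [
    "00:00, 01/05/2017", "12:00, 01/05/2017",
    "00:00, 02/05/2017", "12:00, 02/05/2017",
    "00:00, 03/05/2017", "12:00, 03/05/2017",
]
index = pd.to_datetime(stamps, format="%H:%M, %d/%m/%Y")

# One measurement every 12 hours.
df = pd.DataFrame({"value": [2, 4, 6, 8, 10, 12]}, index=index)
print(df)
```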
And I want to resample it to go from 06:00, 01/05/2017 to 18:00, 02/05/2017 with a "granularity" of 12 hours (this is the same as the original here for simplicity, but it doesn't have to be). The result I want is:
                    value
06:00, 01/05/2017       3
18:00, 01/05/2017       5
06:00, 02/05/2017       7
18:00, 02/05/2017       9
Note that the values are the averages of the original values they overlap (e.g. 3 = mean(2, 4)).
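To spell out the arithmetic for this particular example only: because each target timestamp sits halfway between two original samples, the desired values happen to equal the mean of each consecutive pair. The sketch below just checks those numbers; it is not a general solution and ignores the timestamps entirely.

```python
import pandas as pd

# The original measurements, in time order.
values = pd.Series([2, 4, 6, 8, 10, 12])

# Mean of each consecutive pair: (2, 4), (4, 6), (6, 8), ...
pairwise = values.rolling(2).mean().dropna()

print(pairwise.tolist())  # [3.0, 5.0, 7.0, 9.0, 11.0]
```

The first four entries are the values I am after; the fifth falls outside the requested end date.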
I am not sure how to do this.
My first attempt:
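from datetime import datetime, timedelta

from pandas import DataFrame


def resample(df: DataFrame, start: datetime, end: datetime, granularity: timedelta) -> DataFrame:
    # Average into bins of the requested width, then trim to [start, end].
    result = df.resample(granularity).mean()
    result = result[result.index <= end]
    result = result[result.index >= start]
    return result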
This correctly resamples the DataFrame and gives the right granularity, but it does not align the results with the start date, so the result is:
                    value
12:00, 01/05/2017       4
00:00, 02/05/2017       6
12:00, 02/05/2017       8
My second attempt uses base:
def resample(df: DataFrame, start: datetime, end: datetime, desired_granularity: timedelta) -> DataFrame:
    data_before_start = df[df.index <= start]
    last_date_before_start = data_before_start.last_valid_index()
    # seconds_between_measurements is a helper that is not shown in this post.
    current_granularity_secs = seconds_between_measurements(df)
    rule = str(int(desired_granularity.total_seconds())) + 'S'
    # Shift the bin edges so that they line up with the start date.
    base = current_granularity_secs - (start - last_date_before_start).total_seconds()
    result = df.resample(rule, base=base).mean()
    result = result[result.index < end]
    result = result[result.index >= start]
    return result
This gives:
                    value
06:00, 01/05/2017       4
18:00, 01/05/2017       6
06:00, 02/05/2017       8
18:00, 02/05/2017      10
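As far as I can tell, the values come out as 4, 6, 8, 10 because every 12-hour bin that starts at 06:00 or 18:00 contains exactly one original sample, so its mean is just that sample. The sketch below reproduces this. It uses the `origin` argument of `resample` (available from pandas 1.1, where `base` was deprecated; `base` was removed in pandas 2.0) and assumes day-first dates.

```python
import pandas as pd

# The sample frame: one measurement every 12 hours from 1 May 2017.
index = pd.date_range("2017-05-01 00:00", periods=6, freq=pd.Timedelta(hours=12))
df = pd.DataFrame({"value": [2, 4, 6, 8, 10, 12]}, index=index)

start = pd.Timestamp("2017-05-01 06:00")
step = pd.Timedelta(hours=12)

# Anchor the bin edges at `start` and count the samples in each bin.
counts = df.resample(step, origin=start).count()
means = df.resample(step, origin=start).mean()

print(counts["value"].tolist())    # [1, 1, 1, 1, 1, 1]
print(means.loc[start, "value"])   # 4.0, not the 3 I want
```

So aligning the bins alone is not enough: the bin starting at 06:00 never sees the sample at 00:00.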
EDIT:
- , , , pad(). "" , ()
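A minimal sketch of one approach built around `pad()`, under the assumption that the idea is to first upsample onto a finer grid with forward fill and then average in bins anchored at the start date. The function name `resample_aligned`, the `fine` parameter and the use of `origin` (pandas 1.1+) are my own assumptions for illustration. `ffill()` is used because `pad()` is no longer available on resamplers in pandas 2.0.

```python
import pandas as pd


def resample_aligned(df, start, end, granularity, fine):
    """Forward-fill onto a finer grid, then average in bins anchored at start."""
    # Step 1: upsample so each bin of the target grid sees several points.
    upsampled = df.resample(fine).ffill()
    # Step 2: average in bins whose edges are aligned with `start`.
    result = upsampled.resample(granularity, origin=start).mean()
    # Step 3: keep only the requested window (both ends inclusive).
    return result[(result.index >= start) & (result.index <= end)]


index = pd.date_range("2017-05-01 00:00", periods=6, freq=pd.Timedelta(hours=12))
df = pd.DataFrame({"value": [2, 4, 6, 8, 10, 12]}, index=index)

out = resample_aligned(
    df,
    start=pd.Timestamp("2017-05-01 06:00"),
    end=pd.Timestamp("2017-05-02 18:00"),
    granularity=pd.Timedelta(hours=12),
    fine=pd.Timedelta(hours=6),
)
print(out)  # value column: 3.0, 5.0, 7.0, 9.0
```

The result depends on how fine the intermediate grid is: here 6 hours divides both the original spacing and the offset of the start date, which is why the averages land exactly on 3, 5, 7 and 9.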