Suppose we have a list showing the amount of each object on a specific date (mm-dd-yyyy-hour-minute):
A = [ [ ['07-07-2012-21-04', 'orange', 1], ['08-16-2012-08-57', 'orange', 1], ['08-18-2012-03-30', 'orange', 1], ['08-18-2012-03-30', 'orange', 1], ['08-19-2012-03-58', 'orange', 1], ['08-19-2012-03-58', 'orange', 1], ['08-19-2012-04-09', 'orange', 1], ['08-19-2012-04-09', 'orange', 1], ['08-19-2012-05-21', 'orange', 1], ['08-19-2012-05-21', 'orange', 1], ['08-19-2012-06-03', 'orange', 1], ['08-19-2012-07-51', 'orange', 1], ['08-19-2012-08-17', 'orange', 1], ['08-19-2012-08-17', 'orange', 1] ], [ ['07-07-2012-21-04', 'banana', 1] ], [ ['07-07-2012-21-04', 'mango', 1], ['08-16-2012-08-57', 'mango', 1], ['08-18-2012-03-30', 'mango', 1], ['08-18-2012-03-30', 'mango', 1], ['08-19-2012-03-58', 'mango', 1], ['08-19-2012-03-58', 'mango', 1], ['08-19-2012-04-09', 'mango', 1], ['08-19-2012-04-09', 'mango', 1], ['08-19-2012-05-21', 'mango', 1], ['08-19-2012-05-21', 'mango', 1], ['08-19-2012-06-03', 'mango', 1], ['08-19-2012-07-51', 'mango', 1], ['08-19-2012-08-17', 'mango', 1], ['08-19-2012-08-17', 'mango', 1] ]
]
For each object in A, I need to fill in all the missing dates (from the minimum date to the maximum date in A) with a value of 0. Once the missing dates and their 0 values have been added, I want to sum the values for each date so that no date is repeated within a sublist.
Here is what I have tried so far: I split the dates and the values into separate lists (named u and v), convert each sublist to a pandas Series, and assign the dates as its index. So, for each pair in zip(u, v), I call:
    from pandas import DatetimeIndex, Series, date_range

    def generate(values, indices):
        indices = flatten(indices)  # flatten() is defined elsewhere
        date_index = DatetimeIndex(indices)
        ts = Series(values, index=date_index)
        ts.reindex(date_range(min(date_index), max(date_index)))
        return ts
But here the reindex call raises an exception. What I'm looking for is a pure-Python way (without pandas) based entirely on list comprehensions, or perhaps on NumPy arrays.
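To make the desired result concrete, here is a rough sketch of the kind of standard-library approach I have in mind for the daily case. The names `fill_daily`, `FMT` and `sample` are made up for illustration. It assumes every sublist is non-empty and holds a single object, that the time of day is dropped when aggregating, and that the date range is taken over all of A:

```python
from collections import Counter
from datetime import datetime, timedelta

FMT = '%m-%d-%Y-%H-%M'  # mm-dd-yyyy-hour-minute, as in A


def fill_daily(groups):
    """Sum values per day and add 0 entries for the days that are missing."""
    # Parse every timestamp and keep only the calendar date.
    parsed = [[(datetime.strptime(stamp, FMT).date(), name, value)
               for stamp, name, value in group]
              for group in groups]

    # Continuous range of days from the earliest to the latest date in A.
    all_days = [day for group in parsed for day, _, _ in group]
    first, last = min(all_days), max(all_days)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]

    result = []
    for group in parsed:
        name = group[0][1]
        totals = Counter()
        for day, _, value in group:
            totals[day] += value  # repeated dates collapse into one sum
        # A Counter returns 0 for days that never occurred.
        result.append([[day.strftime('%m-%d-%Y'), name, totals[day]]
                       for day in days])
    return result


sample = [
    [['08-18-2012-03-30', 'orange', 1],
     ['08-18-2012-03-30', 'orange', 1],
     ['08-20-2012-04-09', 'orange', 1]],
    [['08-19-2012-21-04', 'banana', 1]],
]
filled = fill_daily(sample)
# filled[0] -> [['08-18-2012', 'orange', 2], ['08-19-2012', 'orange', 0], ['08-20-2012', 'orange', 1]]
# filled[1] -> [['08-18-2012', 'banana', 0], ['08-19-2012', 'banana', 1], ['08-20-2012', 'banana', 0]]
```

If each object should instead be filled only between its own first and last date, the range would have to be computed inside the loop rather than once for all of A.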
There is also the problem of aggregating by hour: if all the dates are the same and only the hours differ, I want to fill in all the missing hours of the day with 0 values and then repeat the same aggregation for each hour.
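For the hourly case, the behaviour I am after might look roughly like the sketch below. `fill_hourly` and `records` are illustrative names only. It works on one sublist at a time and assumes that "all the missing hours of the day" means every hour from 00 to 23, for each day between that object's first and last timestamp:

```python
from collections import Counter
from datetime import datetime, timedelta

FMT = '%m-%d-%Y-%H-%M'  # mm-dd-yyyy-hour-minute, as in A


def fill_hourly(records):
    """Sum one object's values per hour and add 0 entries for missing hours."""
    name = records[0][1]
    totals = Counter()
    for stamp, _, value in records:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.strptime(stamp, FMT).replace(minute=0)
        totals[hour] += value

    # Assumption: cover whole days, from 00:00 on the first day
    # to 23:00 on the last day.
    start = min(totals).replace(hour=0)
    end = max(totals).replace(hour=23)
    n_hours = int((end - start).total_seconds() // 3600) + 1
    hours = [start + timedelta(hours=i) for i in range(n_hours)]

    # A Counter returns 0 for hours that never occurred.
    return [[h.strftime('%m-%d-%Y-%H'), name, totals[h]] for h in hours]


records = [
    ['08-19-2012-03-58', 'mango', 1],
    ['08-19-2012-03-58', 'mango', 1],
    ['08-19-2012-05-21', 'mango', 1],
]
hourly = fill_hourly(records)
# len(hourly) -> 24
# hourly[3:6] -> [['08-19-2012-03', 'mango', 2], ['08-19-2012-04', 'mango', 0], ['08-19-2012-05', 'mango', 1]]
```

Applied to the whole list this would be `[fill_hourly(group) for group in A]`, so here each object gets its own range.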
Thanks in advance.