I have a pandas dataframe that looks something like this:
                     TIMESTAMP  EVENT_COUNT
0          2014-07-23 04:28:23            1
1          2014-07-23 04:28:24            1
2   2014-07-23 04:28:25.999000            4
3          2014-07-23 04:28:27            1
4   2014-07-23 04:28:28.999000            2
5          2014-07-23 04:28:30            1
6          2014-07-23 04:29:31            7
7          2014-07-23 04:29:33            1
8          2014-07-23 04:29:34            1
9          2014-07-23 04:29:36            1
10         2014-07-23 04:40:37            2
11         2014-07-23 04:40:39            1
12         2014-07-23 04:40:40            1
13         2014-07-23 04:40:42            1
14         2014-07-23 04:40:43            1
15  2014-07-23 04:40:44.999000            4
16         2014-07-23 04:41:46            1
17         2014-07-23 04:41:47            1
18         2014-07-23 04:41:49            1
19         2014-07-23 04:41:50            1
20         2014-07-23 04:50:52            9
21         2014-07-23 04:50:53            4
22         2014-07-23 04:50:55            6
23         2014-07-27 01:12:13            1
My ultimate goal is to find gaps in this data that exceed 5 minutes. So, from the data above, I would expect to find a gap between each of these pairs:
2014-07-23 04:29:36 and 2014-07-23 04:40:37
2014-07-23 04:41:50 and 2014-07-23 04:50:52
2014-07-23 04:50:55 and 2014-07-27 01:12:13
Gaps shorter than 5 minutes do not need to be identified. So the following should not be recognized as gaps:
2014-07-23 04:28:30 and 2014-07-23 04:29:31 (only 61 seconds)
2014-07-23 04:40:37 and 2014-07-23 04:40:39 (only 2 seconds)
2014-07-23 04:40:44.999000 and 2014-07-23 04:41:46 (just over 61 seconds)
How can I find the gaps described above? When I tried a solution suggested in another answer, nothing changed. I used the following command:
df.reindex(pd.date_range(min(df['TIMESTAMP']), max(df['TIMESTAMP']), freq='5min')).fillna(0)
After running this command, the dataframe looks the same.
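For reference, the gap criterion described above can be written down directly as a comparison of each timestamp with the previous one. The following is only a minimal sketch, not a confirmed solution: it assumes TIMESTAMP is already a datetime column, it uses a hypothetical subset of the rows shown above, and the names delta, is_gap and gaps are made up for illustration.

```python
import pandas as pd

# Hypothetical subset of the rows above, chosen to include both
# real gaps (> 5 minutes) and short pauses that must be ignored.
df = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime([
        "2014-07-23 04:28:30",
        "2014-07-23 04:29:31",
        "2014-07-23 04:29:36",
        "2014-07-23 04:40:37",
        "2014-07-23 04:41:50",
        "2014-07-23 04:50:52",
        "2014-07-23 04:50:55",
        "2014-07-27 01:12:13",
    ]),
    "EVENT_COUNT": [1, 7, 1, 2, 1, 9, 6, 1],
})

# Make sure rows are in time order before comparing neighbours.
df = df.sort_values("TIMESTAMP").reset_index(drop=True)

# Time elapsed since the previous row (NaT for the first row).
delta = df["TIMESTAMP"].diff()

# True where the pause before this row is longer than 5 minutes.
# The NaT in the first row compares as False, so it is never a gap.
is_gap = delta > pd.Timedelta(minutes=5)

# One row per gap: the timestamp before it, the timestamp after it,
# and how long the gap lasted.
gaps = pd.DataFrame({
    "gap_start": df["TIMESTAMP"].shift(1)[is_gap],
    "gap_end": df["TIMESTAMP"][is_gap],
    "duration": delta[is_gap],
}).reset_index(drop=True)

print(gaps)
```

On this subset, the sketch should report the three pairs I listed as gaps and skip the 61-second pause. Note also that reindex returns a new object rather than modifying df in place, which may be part of why df looked unchanged.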