I have a data frame of events and need to find the start, end and event count of each run, where a run is a sequence of events in which each event occurs less than a certain period of time after the previous one.
The data frame rows are already sorted by time.
For example:
library(lubridate)
ts <- c("2016-10-28 19:21:19",
"2016-10-28 19:21:20",
"2016-10-28 19:21:21",
"2016-10-28 19:21:21",
"2016-10-28 19:23:23",
"2016-10-28 19:23:24",
"2016-10-28 19:23:24",
"2016-10-28 19:23:25",
"2016-10-30 03:59:09",
"2016-10-30 08:54:31",
"2016-10-30 08:54:35"
)
df <- data.frame(time=ymd_hms(ts))
What I would like to get is a data frame like the following, where a run continues as long as each event is within 60 seconds of the previous event:
start                end                  count
2016-10-28 19:21:19  2016-10-28 19:21:21  4
2016-10-28 19:23:23  2016-10-28 19:23:25  4
2016-10-30 03:59:09  2016-10-30 03:59:09  1
2016-10-30 08:54:31  2016-10-30 08:54:35  2
The actual sequences will be very long, so the solution should work well with large inputs (~100k rows).
I looked at lag, diff and other functions, but cannot see a simple and efficient way to do it.
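To make the grouping rule concrete, here is a minimal base R sketch of one possible approach, not a definitive solution. The helper name find_runs and its max_gap parameter are hypothetical, and the sketch assumes a sorted POSIXct vector and that a gap of exactly max_gap seconds starts a new run (since runs are described as events "less than" the period apart). It uses only vectorised operations (diff, cumsum, tabulate), so it should scale to inputs of around 100k rows.

```r
# Hypothetical helper: summarise runs of events in which each event occurs
# less than `max_gap` seconds after the previous one.
# Assumes `times` is a POSIXct vector already sorted in ascending order.
find_runs <- function(times, max_gap = 60) {
  if (length(times) == 0) {
    return(data.frame(start = times, end = times, count = integer(0)))
  }
  # Gap in seconds between each event and the one before it
  gaps <- diff(as.numeric(times))
  # Assumption: a gap of max_gap seconds or more breaks the run
  is_break <- gaps >= max_gap
  is_start <- c(TRUE, is_break)  # first event, or first event after a break
  is_end   <- c(is_break, TRUE)  # last event before a break, or last event
  run_id   <- cumsum(is_start)   # consecutive run number for every event
  data.frame(
    start = times[is_start],
    end   = times[is_end],
    count = tabulate(run_id)     # number of events in each run
  )
}

# Sample data from the question, parsed with base R instead of lubridate
ts <- c("2016-10-28 19:21:19", "2016-10-28 19:21:20", "2016-10-28 19:21:21",
        "2016-10-28 19:21:21", "2016-10-28 19:23:23", "2016-10-28 19:23:24",
        "2016-10-28 19:23:24", "2016-10-28 19:23:25", "2016-10-30 03:59:09",
        "2016-10-30 08:54:31", "2016-10-30 08:54:35")
times <- as.POSIXct(ts, tz = "UTC")

runs <- find_runs(times, max_gap = 60)
print(runs)
```

With the data frame from the question, the same call would be find_runs(df$time), since ymd_hms also returns a POSIXct vector.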