I have the following data, where each row corresponds to a trip made by a member of a household. Because several household members can travel at the same time, rows may overlap in time, as rows 1 and 2 do. Duration is given in minutes. IDX is just a row index that makes it possible to map the results back to the original rows.
IDX | ID | Trip | StartDateTime | Duration (in minutes)
1 | 1 | 1 | 2015-01-21 13:00 | 100
2 | 1 | 1 | 2015-01-21 13:00 | 184
3 | 1 | 1 | 2015-01-21 10:00 | 91
4 | 1 | 2 | 2015-01-22 13:00 | 30
5 | 2 | 2 | 2015-01-30 23:00 | 100
Now I would like to split each row, per ID, trip, and day, into hourly rows. For example, row 1 should become:
IDX | ID | Trip | StartDateTime | Duration (in minutes)
1 | 1 | 1 | 2015-01-21 13:00 | 60
1 | 1 | 1 | 2015-01-21 14:00 | 40
Note that the total duration of this group is still 100, the same as in the original row 1, and that IDX is carried over from that row. Row 4, however, lasts no more than 60 minutes, so it is not split, resulting in:
IDX | ID | Trip | StartDateTime | Duration (in minutes)
4 | 1 | 2 | 2015-01-22 13:00 | 30
Row 5 crosses midnight, so its second part falls on the next day:
IDX | ID | Trip | StartDateTime | Duration (in minutes)
5 | 2 | 2 | 2015-01-30 23:00 | 60
5 | 2 | 2 | 2015-01-31 00:00 | 40
How can I do this?
The sample data:
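library(data.table)
# ID and Trip match the table above (row 4 is ID 1, Trip 2).
# as.POSIXct is used instead of strptime, which returns POSIXlt;
# data.table does not store POSIXlt columns.
dt <- data.table(IDX = 1:5,
                 ID = c(1,1,1,1,2),
                 Trip = c(1,1,1,2,2),
                 StartDateTime = as.POSIXct(c("2015-01-21 13:00","2015-01-21 13:00","2015-01-21 10:00","2015-01-22 13:00","2015-01-30 23:00"), format="%Y-%m-%d %H:%M"),
                 Duration = c(100,184,91,30,100)
)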
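The start time is not always on the hour; it could be 13:12, for instance. In that case the resulting rows should still follow clock hours.
For example, this row: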
IDX | ID | Trip | StartDateTime | Duration (in minutes)
6 | 3 | 1 | 2015-01-30 23:14 | 67
The expected result is:
IDX | ID | Trip | StartDateTime | Duration (in minutes)
6 | 3 | 1 | 2015-01-30 23:00 | 46
6 | 3 | 1 | 2015-01-31 00:00 | 21
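One possible way to get this split, as a minimal sketch rather than a definitive solution: work in epoch seconds, find the clock hour that contains the start, and clip each hourly bucket to the trip's start and end. The helper name split_hourly is hypothetical (it is not part of data.table). The sketch assumes a time zone with a whole-hour offset (UTC here) and uses the sample rows with row 4 as ID 1, Trip 2, Duration 30, plus the off-the-hour row with IDX 6.

```r
library(data.table)

# Sample data from the question, including the off-the-hour row (IDX 6).
dt <- data.table(
  IDX = 1:6,
  ID = c(1, 1, 1, 1, 2, 3),
  Trip = c(1, 1, 1, 2, 2, 1),
  StartDateTime = as.POSIXct(
    c("2015-01-21 13:00", "2015-01-21 13:00", "2015-01-21 10:00",
      "2015-01-22 13:00", "2015-01-30 23:00", "2015-01-30 23:14"),
    format = "%Y-%m-%d %H:%M", tz = "UTC"),
  Duration = c(100, 184, 91, 30, 100, 67)
)

# Hypothetical helper: split one trip into clock-hour buckets.
# Assumes the time zone offset is a whole number of hours (UTC here).
split_hourly <- function(start, duration) {
  s <- as.numeric(start)               # trip start in epoch seconds
  e <- s + duration * 60               # trip end in epoch seconds
  first_hour <- s - s %% 3600          # clock hour containing the start
  n <- max(ceiling((e - first_hour) / 3600), 1)  # number of buckets touched
  hours <- first_hour + 3600 * (seq_len(n) - 1)  # start of each bucket
  seg_start <- pmax(hours, s)          # clip bucket to the trip start
  seg_end <- pmin(hours + 3600, e)     # clip bucket to the trip end
  list(
    StartDateTime = as.POSIXct(hours, origin = "1970-01-01", tz = "UTC"),
    Duration = (seg_end - seg_start) / 60        # minutes inside the bucket
  )
}

# IDX is unique per row, so each group is a single trip.
hourly <- dt[, split_hourly(StartDateTime, Duration), by = .(IDX, ID, Trip)]
print(hourly)
```

Grouping by IDX keeps the original row index on every hourly row, so the per-IDX sum of Duration in hourly can be compared against the original Duration as a sanity check.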