Receive dates from one data frame and filter data into another data frame

Question

I have two data frames:

    user = c(rep('A', 7), rep('B', 8))
    data = seq(1:15)
    date = as.Date(c('2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04', '2016-01-05',
                     '2016-01-06', '2016-01-07', '2016-01-08', '2016-01-09', '2016-01-10',
                     '2016-01-11', '2016-01-12', '2016-01-13', '2016-01-14', '2016-01-15'))
    df = data.frame(user, date, data)
    df
       user       date data
    1     A 2016-01-01    1
    2     A 2016-01-02    2
    3     A 2016-01-03    3
    4     A 2016-01-04    4
    5     A 2016-01-05    5
    6     A 2016-01-06    6
    7     A 2016-01-07    7
    8     B 2016-01-08    8
    9     B 2016-01-09    9
    10    B 2016-01-10   10
    11    B 2016-01-11   11
    12    B 2016-01-12   12
    13    B 2016-01-13   13
    14    B 2016-01-14   14
    15    B 2016-01-15   15

and

    df1 = data.frame(user = c('A', 'B'),
                     start_date = as.Date(c('2016-01-02', '2016-01-10')),
                     end_date = as.Date(c('2016-01-06', '2016-01-14')))
    > df1
      user start_date   end_date
    1    A 2016-01-02 2016-01-06
    2    B 2016-01-10 2016-01-14

I want to use the start and end dates from df1 to filter the date column of df: for each user, only the rows whose date falls between that user's start_date and end_date (inclusive) should be kept. The resulting data frame should look like this:

    user       date data
       A 2016-01-02    2
       A 2016-01-03    3
       A 2016-01-04    4
       A 2016-01-05    5
       A 2016-01-06    6
       B 2016-01-10   10
       B 2016-01-11   11
       B 2016-01-12   12
       B 2016-01-13   13
       B 2016-01-14   14

I tried the following: loop over each user and subset that user's rows into a data frame, filter those rows using the start_date and end_date of the corresponding record in df1, and then append the result to a new data frame. This takes a very long time because the data is very large. Is there a more efficient way to do this?
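For reference, the loop-based approach described above might look roughly like the sketch below. It is a reconstruction for illustration only, not the original code; the names `rows` and `result` are hypothetical. Growing `result` with `rbind()` copies the accumulated data on every iteration, which is one plausible reason this pattern gets slow on large inputs.

```r
# Example data from the question (strings kept as character for simple comparison)
df <- data.frame(
  user = c(rep("A", 7), rep("B", 8)),
  date = seq(as.Date("2016-01-01"), by = "day", length.out = 15),
  data = 1:15,
  stringsAsFactors = FALSE
)
df1 <- data.frame(
  user = c("A", "B"),
  start_date = as.Date(c("2016-01-02", "2016-01-10")),
  end_date = as.Date(c("2016-01-06", "2016-01-14")),
  stringsAsFactors = FALSE
)

result <- data.frame()
for (i in seq_len(nrow(df1))) {
  # Subset the rows that belong to the current user
  rows <- df[df$user == df1$user[i], ]
  # Keep only the rows inside this user's date window (inclusive)
  rows <- rows[rows$date >= df1$start_date[i] & rows$date <= df1$end_date[i], ]
  # Append to the accumulated result (copies `result` each time)
  result <- rbind(result, rows)
}

print(result)
```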

Thanks.

+3

r dplyr

haimen Mar 23 '16 at 23:52

2 answers

With the recently implemented non-equi join feature in data.table v1.9.8+, this can be done as follows:

    require(data.table) # v1.9.8+
    setDT(df)[df1, .(user, date, data), on = .(user, date >= start_date, date <= end_date)]
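A self-contained variant of the same non-equi join is sketched below on the question's example data. The one deliberate difference is `date = x.date` in `j`: the `x.` prefix states explicitly that the date should come from the rows of the left-hand table rather than from the interval bounds in the table being joined. The names `dt`, `dt1`, and `res` are only illustrative.

```r
library(data.table)  # non-equi joins need v1.9.8 or later

# Example data from the question, built directly as data.tables
dt <- data.table(
  user = c(rep("A", 7), rep("B", 8)),
  date = seq(as.Date("2016-01-01"), by = "day", length.out = 15),
  data = 1:15
)
dt1 <- data.table(
  user = c("A", "B"),
  start_date = as.Date(c("2016-01-02", "2016-01-10")),
  end_date = as.Date(c("2016-01-06", "2016-01-14"))
)

# For each row of dt1, find the rows of dt with the same user
# and a date inside [start_date, end_date]
res <- dt[dt1,
          .(user, date = x.date, data),
          on = .(user, date >= start_date, date <= end_date)]

print(res)
```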

+3

Arun Jun 25 '16 at 0:58

adaien · Accepted Answer · 2016-03-23T23:59:06+0000

    library(dplyr)
    df <- left_join(df, df1, by = "user")
    df <- df %>% filter(date >= start_date & date <= end_date)
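The same join-then-filter idea can also be written as a single pipeline that leaves `df` untouched and drops the helper columns afterwards, so the output has the shape requested in the question. This is only a sketch on the question's example data; `result` is an illustrative name, and `inner_join` is swapped in for `left_join` here, which makes no difference once the filter is applied because rows without a window are discarded either way.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  user = c(rep("A", 7), rep("B", 8)),
  date = seq(as.Date("2016-01-01"), by = "day", length.out = 15),
  data = 1:15
)
df1 <- data.frame(
  user = c("A", "B"),
  start_date = as.Date(c("2016-01-02", "2016-01-10")),
  end_date = as.Date(c("2016-01-06", "2016-01-14"))
)

result <- df %>%
  inner_join(df1, by = "user") %>%                  # attach each user's window
  filter(date >= start_date, date <= end_date) %>%  # keep rows inside the window
  select(user, date, data)                          # drop the helper columns

print(result)
```

Note that the join materializes one row per matching `user` pair before filtering, so memory use grows if a user has several windows in `df1`.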
