Rolling join gives NA in data.table

data

 Usage <- structure(list(feature = c("M11", "M11", "M11", "M11", "M11", "M11", "M11"), 
                       startDate = structure(c(17130, 17130, 17130, 17130,17155, 17155, 17155), class = "Date"), 
                       cc = c("X6", "X6", "X6", "X6", "X6", "X6", "X6"), vendor = c("Z1", "Z1", "Z1", "Z1", "Z1","Z1", "Z1")), .Names = c("feature", "startDate", "cc", "vendor"), 
                  row.names = c(NA,-7L), class = c("data.table",  "data.frame"))


 Limits <- structure(list(vendorId = c("Z1", "Z1", "Z1", "Z1", "Z1", "Z1"), 
                       featureId = c("M11", "M11", "M11", "M11", "M11", "M11"), 
                       costcenter = c("X6", "X6", "X6", "X6", "X6", "X6"), 
                       oldLimit = c(1L,2L, 3L, 4L, 5L, 6L), date = structure(c(17135,  17105, 17074, 17044, 17149, 17119), class = "Date")), 
                  .Names = c("vendorId", "featureId","costcenter",     "oldLimit",  "date"), row.names = c(NA, -6L), class = "data.frame")

  library(data.table)  # needed for setDT() and the joins below
  setDT(Usage)
  setDT(Limits)

I am trying to add a "limit" column to the "Usage" data.table by looking it up in the "Limits" data.table. The goal is to find the limit that applied to each combination of "feature", "cc" and "vendor" at the time of its usage.

However, when I try to do a rolling join using the code below, I get strange results. I get a lot of NAs on my real data, so I created the sample data shown above. Below is the code for my join.

Usage[Limits, limitAtStartDate:= i.oldLimit,   on=c(cc="costcenter",feature="featureId",
                                  vendor="vendorId", startDate="date" ), roll=T,verbose=T] 

> Usage
   feature  startDate cc vendor limitAtStartDate
1:     M11 2016-11-25 X6     Z1                6
2:     M11 2016-11-25 X6     Z1               NA
3:     M11 2016-11-25 X6     Z1               NA
4:     M11 2016-11-25 X6     Z1               NA
5:     M11 2016-12-20 X6     Z1                5
6:     M11 2016-12-20 X6     Z1               NA
7:     M11 2016-12-20 X6     Z1               NA

Why are "5" and "6" set for only one entry for "limitAtStartDate"?

I expect 5 for all rows with the date 2016-12-20 and 6 for all 2016-11-25. Please let me know where I am going wrong. I am using data.table version 1.10.0.


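In an X[Y] join, data.table looks up each row of Y in X, so the result is aligned with the rows of Y, not with those of X. Your call looks up Limits in Usage, while you need one value for each of the 7 rows of Usage. So the join has to go the other way around: look up Usage in Limits and extract oldLimit from Limits.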

Limits[Usage, 
       oldLimit, 
       on = .(costcenter = cc, featureId = feature, vendorId = vendor, date = startDate),
       roll = TRUE]
## [1] 6 6 6 6 5 5 5
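To get the column into Usage, as the question asks, the vector returned by the reversed join can be assigned by reference. The following is a minimal sketch rather than the only way to do it: it rebuilds the question's sample data in compact form, and the intermediate variable name `lim` is just an illustrative choice.

```r
library(data.table)

# Same sample data as in the question, written compactly
# (length-1 values are recycled by data.table())
Usage <- data.table(
  feature   = "M11",
  startDate = as.Date(c(17130, 17130, 17130, 17130, 17155, 17155, 17155),
                      origin = "1970-01-01"),
  cc        = "X6",
  vendor    = "Z1"
)
Limits <- data.table(
  vendorId   = "Z1",
  featureId  = "M11",
  costcenter = "X6",
  oldLimit   = 1:6,
  date       = as.Date(c(17135, 17105, 17074, 17044, 17149, 17119),
                       origin = "1970-01-01")
)

# Look up each Usage row in Limits; roll = TRUE applies to the last
# join column (the date), so the most recent limit on or before
# startDate is picked. The result has one value per Usage row.
lim <- Limits[Usage, oldLimit,
              on = .(costcenter = cc, featureId = feature,
                     vendorId = vendor, date = startDate),
              roll = TRUE]

# Add the looked-up limits to Usage by reference
Usage[, limitAtStartDate := lim]
print(Usage)
```

Because the join returns its rows in the order of Usage, the vector lines up with the table it is assigned to.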

Alternatively, since this sample contains only a single feature/cc/vendor combination, you could use findInterval.

setorder(Limits, date)[findInterval(Usage$startDate, date), oldLimit]
## [1] 6 6 6 6 5 5 5
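The findInterval call above can be unpacked in base R to show what it does. This is only an illustrative sketch: the variable names are made up here, and it assumes a single feature/cc/vendor group, because findInterval knows nothing about the other join columns.

```r
# Limit dates and their limits from the question, sorted by date
# (findInterval requires a non-decreasing vector of break points)
limit_dates <- as.Date(c(17135, 17105, 17074, 17044, 17149, 17119),
                       origin = "1970-01-01")
old_limit   <- 1:6
ord         <- order(limit_dates)
limit_dates <- limit_dates[ord]
old_limit   <- old_limit[ord]

# The two distinct start dates from Usage
start_dates <- as.Date(c(17130, 17155), origin = "1970-01-01")

# For each start date, the index of the last limit date that is <= it
idx <- findInterval(start_dates, limit_dates)

# The limit in force on each start date
limit_at_start <- old_limit[idx]
print(limit_at_start)
```

A start date earlier than every limit date would give index 0, which silently drops that element when used for subsetting, so real data with several groups or early dates is better served by the rolling join.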

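A note on the roll argument:

  • In data.table, roll = 2 is not the same as roll = TRUE: roll = TRUE carries the previous value forward without limit, whereas a number such as roll = 2 only carries it forward across a gap of at most 2 (here, 2 days).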

Source: https://habr.com/ru/post/1669241/

