I assume the data needs to be sorted by "id" for the residuals to line up with the right rows. Fortunately, this happens automatically when you set the key:
dat <- read.table(text="dte, id, val1, val2
2001-10-02, 1, 10, 25
2001-10-03, 1, 11, 24
2001-10-04, 1, 12, 23
2001-10-02, 2, 13, 22
2001-10-03, 2, 14, 21
", header=TRUE, sep=",")
dtb <- data.table(dat)
setkey(dtb, "id")
dtb[, residuals(lm(val1 ~ val2)), by="id"]
#---------------
cbind(dtb, dtb[, residuals(lm(val1 ~ val2)), by="id"])
#---------------
            dte id val1 val2 id.1            V1
[1,] 2001-10-02  1   10   25    1  1.631688e-15
[2,] 2001-10-03  1   11   24    1 -3.263376e-15
[3,] 2001-10-04  1   12   23    1  1.631688e-15
[4,] 2001-10-02  2   13   22    2  0.000000e+00
[5,] 2001-10-03  2   14   21    2  0.000000e+00

> dat <- data.frame(dte=Sys.Date()+1:1000000,
+                   id=sample(1:2, 1000000, repl=TRUE),
+                   val1=runif(1000000), val2=runif(1000000) )
> dtb <- data.table(dat)
> setkey(dtb, "id")
> system.time( cbind(dtb, dtb[, residuals(lm(val1 ~ val2)), by="id"]) )
   user  system elapsed
  1.696   0.798   2.466
> system.time( dtb[,transform(.SD,r = residuals(lm(val1~val2))),by = "id"] )
   user  system elapsed
  1.757   0.908   2.690
EDIT from Matthew: This is all correct for v1.8.0 on CRAN. With the slight addition that transform in j is the subject of the data.table wiki, clause 2: "For speed do not transform() by group, cbind() after". But := now works on a group in v1.8.1 and is simple and fast. See my answer for an illustration (but no need to vote for it).
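For reference, here is a minimal sketch of what := by group could look like on the toy data, assuming data.table v1.8.1 or later is installed. The column name r is my own choice, and this is not necessarily the same code as in Matthew's answer:

```r
library(data.table)

# Toy data with the same id/val1/val2 values as the example above
dtb <- data.table(
  id   = c(1, 1, 1, 2, 2),
  val1 = c(10, 11, 12, 13, 14),
  val2 = c(25, 24, 23, 22, 21)
)
setkey(dtb, "id")

# Fit one regression per id and add its residuals as a new column.
# := assigns by reference, so no cbind() or transform() is needed.
# "r" is an illustrative column name, not one from the original post.
dtb[, r := residuals(lm(val1 ~ val2)), by = "id"]

print(dtb)
```

Because := modifies dtb in place, the residuals end up on the rows they belong to without relying on the two tables passed to cbind() being in the same order.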
Ok, I voted for it. Here is the console command to install v1.8.1 on a Mac (if you have the necessary Xcode tools available, since it is only distributed as source):
install.packages("data.table", repos= "http://R-Forge.R-project.org", type="source", lib="/Library/Frameworks/R.framework/Versions/2.14/Resources/lib")
(For some reason, I was unable to get the Mac GUI package installer to read r-forge as a repository.)