Equivalent of ddply(..., transform, ...) in data.table

I have the following code using ddply from the plyr package:

 ddply(mtcars,.(cyl),transform,freq=length(cyl)) 

The data.table version of this is:

 DT<-data.table(mtcars)
 DT[,freq:=.N,by=cyl]

How can this be extended to several functions? For example, I now want to compute more than one summary with both ddply and data.table:

 ddply(mtcars,.(cyl),transform,freq=length(cyl),sum=sum(mpg))
 DT[,list(freq=.N,sum=sum(mpg)),by=cyl]

But data.table returns only three columns: cyl, freq and sum. I can work around that by listing every column:

 DT[,list(freq=.N,sum=sum(mpg),mpg,disp,hp,drat,wt,qsec,vs,am,gear,carb),by=cyl] 

But I have a large number of variables in my real data, and I want all of them to be kept, as in ddply(..., transform, ...). Is there a shortcut in data.table, analogous to := when there is only one function (as shown above), or something like paste(names(mtcars),collapse=",") inside data.table? Note: I also have a large number of functions to run, so I cannot repeat := several times (and I would prefer a solution where lapply can be applied).

2 answers

Use := in its functional form, wrapped in backquotes, like this:

 DT[ , `:=`( freq = .N , sum = sum(mpg) ) , by=cyl ]
 head( DT , 3 )
 #     mpg cyl disp  hp drat    wt  qsec vs am gear carb freq   sum
 # 1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4    7 138.2
 # 2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4    7 138.2
 # 3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   11 293.3
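For comparison, here is a small sketch of how this differs from the list(...) attempt in the question: list() aggregates to one row per group, while the functional := adds the new columns by reference and leaves every existing row and column in place. The name agg is only illustrative.

```r
library(data.table)

DT <- as.data.table(mtcars)

# Aggregating with list() collapses the result to one row per cyl group.
agg <- DT[, list(freq = .N, sum = sum(mpg)), by = cyl]

# Functional `:=` adds the columns by reference; all rows and columns are kept.
DT[, `:=`(freq = .N, sum = sum(mpg)), by = cyl]

dim(agg)  # 3 rows, 3 columns
dim(DT)   # 32 rows, 13 columns (the 11 original columns plus freq and sum)
```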

Also useful in some situations:

 newvars <- c("freq","sum")
 DT[, `:=`(eval(newvars), list(.N,sum(mpg)))]
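Since the question asks about running many functions with lapply, the same idea can probably be extended as in the sketch below. It assumes every function takes a single column and returns one value per group; the list funs and its names (mpg_n, mpg_sum, mpg_mean) are hypothetical and not from the original post.

```r
library(data.table)

DT <- as.data.table(mtcars)

# Hypothetical named list of summary functions; the names become the new columns.
funs <- list(mpg_n = length, mpg_sum = sum, mpg_mean = mean)

# (names(funs)) on the left is evaluated to a character vector of column names;
# lapply() on the right returns one value per function for each cyl group,
# and each value is recycled across the rows of that group.
DT[, (names(funs)) := lapply(funs, function(f) f(mpg)), by = cyl]

head(DT, 3)
```

All original columns are kept, and adding another summary only means adding another entry to funs.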

Source: https://habr.com/ru/post/956655/

