This question is similar, but not identical, to "Add multiple columns to R data.table in one function call?"
Say I have a data.table:
library(data.table)
ex <- data.table(AAA = runif(100000), BBBB = runif(100000), CCC = runif(100000), DDD = runif(100000),
                 EEE = runif(100000), FFF = runif(100000), HHH = runif(100000), III = runif(100000),
                 FLAG = rep(c("a", "b", "c", "d", "e"), 20000))
I can get the sum and mean of all columns by doing
ex[,c(sum=lapply(.SD,sum),mean=lapply(.SD,mean)),by=FLAG]
The results look good: the names given in j are prefixed to the existing column names (sum.AAA, mean.AAA, and so on), which makes the columns easy to identify, and there is only 1 row for each of the FLAG values, as expected.
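The naming behaviour described above comes from base R's c(), not from data.table itself. A minimal sketch on a plain list, where `cols` and its values are made up for illustration:

```r
# Hypothetical toy data: two small numeric columns held in a plain list.
cols <- list(AAA = c(1, 2, 3), BBBB = c(10, 20, 30))

# c() applied to named arguments that are themselves named lists joins
# the outer name and the inner name with a dot.
out <- c(sum = lapply(cols, sum), mean = lapply(cols, mean))

names(out)
# "sum.AAA"  "sum.BBBB"  "mean.AAA"  "mean.BBBB"
```

Each element of `out` has length 1, which is why the data.table call yields a single row per group.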
However, let's say I have a function that returns a list, such as
sk <- function(x) {
  meanx <- mean(x)
  lenx <- length(x)
  difxmean <- x - meanx
  m4 <- sum((difxmean)^4) / lenx
  m3 <- sum((difxmean)^3) / lenx
  m2 <- sum((difxmean)^2) / lenx
  list(mean = meanx, len = lenx, sd = m2^.5, skew = m3 / m2^(3/2), kurt = (m4 / m2^2) - 3)
}
If I do
ex[,lapply(.SD,sk),by=FLAG]
I get a separate row for each element of the list that sk returns, rather than a single row per FLAG value. I would like to have only 1 row of results per FLAG value, with a column for each combination of source column and function result.
For example, output columns should be
AAA.mean AAA.len AAA.sd AAA.skew AAA.kurt BBBB.mean BBBB.len BBBB.sd BBBB.skew BBBB.kurt .... III.mean III.len III.sd III.skew III.kurt
Is there any way to do this?
I know that I can just put all these separate functions in j and get the columns, but I find that using this single function instead of a separate function for each moment is a bit faster:
x <- runif(10000000)
system.time({
  mean(x)
  length(x)
  sd(x)
  skewness(x)
  kurtosis(x)
})
   user  system elapsed
   5.84    0.47    6.30
system.time(sk(x))
   user  system elapsed
    3.9     0.1     4.0
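For the output shape asked about above (AAA.mean, AAA.len, ...), one possible direction is to flatten the list of lists by one level before data.table sees it. This is only a sketch under assumptions: `small` is a made-up two-column table, and base R's unlist(recursive = FALSE) is what builds the dotted names.

```r
library(data.table)

# Same moment function as in the question.
sk <- function(x) {
  meanx <- mean(x)
  lenx <- length(x)
  difxmean <- x - meanx
  m4 <- sum(difxmean^4) / lenx
  m3 <- sum(difxmean^3) / lenx
  m2 <- sum(difxmean^2) / lenx
  list(mean = meanx, len = lenx, sd = m2^0.5,
       skew = m3 / m2^(3/2), kurt = (m4 / m2^2) - 3)
}

# Hypothetical small table with only two value columns, for illustration.
set.seed(1)
small <- data.table(AAA = runif(100), BBBB = runif(100),
                    FLAG = rep(c("a", "b", "c", "d", "e"), 20))

# lapply(.SD, sk) is a list of lists. Flattening one level turns every
# statistic into its own length-1 element named <column>.<statistic>,
# so each group contributes exactly one row.
res <- small[, unlist(lapply(.SD, sk), recursive = FALSE), by = FLAG]

names(res)  # FLAG, AAA.mean, AAA.len, ..., BBBB.kurt
nrow(res)   # one row per FLAG value
```

Whether this stays faster than calling the separate functions on the full table has not been measured here.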