I would like to perform several aggregations using data.table's lapply(.SD, ...) approach, but my attempts either error out or do the equivalent of rbind rather than cbind.
For example, to get both the mean and the median mpg in mtcars by cyl, you can do the following:
library(data.table)
mtcars.dt <- data.table(mtcars)
mtcars.dt[, list(mpg.mean=mean(mpg), mpg.median=median(mpg)), by="cyl"]
   cyl mpg.mean mpg.median
1:   6    19.74       19.7
2:   4    26.66       26.0
3:   8    15.10       15.2
But applying the .SD approach either rbinds the results of the functions:
mtcars.dt[, lapply(.SD, function(x) list(mean(x), median(x))),
by="cyl", .SDcols=c("mpg")]
   cyl              mpg
1:   6 19.7428571428571
2:   6             19.7
3:   4 26.6636363636364
4:   4               26
5:   8             15.1
6:   8             15.2
Or breaks altogether:
mtcars.dt[, lapply(.SD, list(mean, median)),
by="cyl", .SDcols=c("mpg")]
Error in `[.data.table`(mtcars.dt, , lapply(.SD, list(mean, median)), :
attempt to apply non-function
EDIT: As Senor O noted, some of the answers provided work for my example, but only because there is a single aggregation column. An ideal solution would work for multiple columns, for example by replacing the following:
mtcars.dt[, list(mpg.mean=mean(mpg), mpg.median=median(mpg),
hp.mean=mean(hp), hp.median=median(hp)), by="cyl"]
   cyl mpg.mean mpg.median hp.mean hp.median
1:   6    19.74       19.7  122.29     110.0
2:   4    26.66       26.0   82.64      91.0
3:   8    15.10       15.2  209.21     192.5
, , . , - , , .SDcols AFAIK.
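For reference, here is a minimal sketch of one way the multi-column case might be written with .SD. It is only an illustration, not necessarily the idiomatic data.table answer: the names `cols`, `funs`, and `res` are hypothetical and introduced just for this example. The idea is to have every column return a named list of results, then flatten the nested list so that j yields a single row per group (cbind-like) instead of one row per function.

```r
library(data.table)

mtcars.dt <- data.table(mtcars)

# Hypothetical helper names for this sketch (not from the question above).
cols <- c("mpg", "hp")                       # columns to aggregate
funs <- list(mean = mean, median = median)   # named list of aggregations

# For each column in .SD, apply every function in `funs`, producing a
# list of named lists. unlist(recursive = FALSE) flattens one level, so
# the result is a flat list with names such as "mpg.mean" and "hp.median",
# which data.table turns into one row per group.
res <- mtcars.dt[,
  unlist(lapply(.SD, function(x) lapply(funs, function(f) f(x))),
         recursive = FALSE),
  by = "cyl", .SDcols = cols]

print(res)
```

This assumes every function returns a single value of a consistent type in each group; if a function returned different types across groups (for example integer in one and double in another), data.table would likely complain about the column type.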