You can store the first result in a temporary variable and reuse it to compute the other columns:
    DT[, c("cumsum", "cumsumofcumsum") := {
      x <- cumsum(val)
      list(x, cumsum(x))
    }, by = id]
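To make the pattern concrete, here is a minimal sketch on a tiny table. The sample data is made up purely for illustration (the question's real `DT` is not shown here); only the column names `id` and `val` are taken from the code above.

```r
library(data.table)

# Hypothetical sample data: two groups with three values each
DT <- data.table(id  = c(1, 1, 1, 2, 2, 2),
                 val = c(1, 2, 3, 10, 20, 30))

DT[, c("cumsum", "cumsumofcumsum") := {
  x <- cumsum(val)      # temporary variable, computed once per group
  list(x, cumsum(x))    # reused to build the second column
}, by = id]

print(DT)
```

For group 1 the `cumsum` column should be 1, 3, 6 and `cumsumofcumsum` should be 1, 4, 10; both restart at the first row of group 2. Because `:=` updates `DT` by reference, no reassignment is needed.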
Of course, you can use dplyr and use your data table as a backend, but I'm not sure that you will get the same performance as the pure data.table method:
    library(dplyr)
    DT %>%
      group_by(id) %>%
      mutate(
        cum1 = cumsum(val),
        cum2 = cumsum(cum1)
      )
EDIT: added some benchmarks.
In the benchmark below, the pure data.table solution is about 5 times faster than dplyr (comparing medians). I think the view that dplyr builds behind the scenes may explain this difference.
    f_dt <- function(){
      DT[, c("cumsum", "cumsumofcumsum") := {
        x <- as.numeric(cumsum(val))
        list(x, cumsum(x))
      }, by = id]
    }

    f_dplyr <- function(){
      DT %>%
        group_by(id) %>%
        mutate(
          cum1 = as.numeric(cumsum(val)),
          cum2 = cumsum(cum1)
        )
    }

    library(microbenchmark)
    microbenchmark(f_dt(), f_dplyr(), times = 100)

         expr       min       lq    median        uq       max neval
       f_dt()  2.580121  2.97114  3.256156  4.318658  13.49149   100
    f_dplyr() 10.792662 14.09490 15.909856 19.593819 159.80626   100