What is the best (fastest) way to implement a sliding window function with the data.table package?
I am trying to calculate a rolling median, but I have multiple rows per day (due to two additional factors), which I think means that zoo's rollapply function will not work. Here is an example using a naive loop:
    library(data.table)

    # Example data: 30 days, 1000 rows per day, two grouping factors
    df <- data.frame(
      id=30000,
      date=rep(as.IDate(as.IDate("2012-01-01")+0:29, origin="1970-01-01"), each=1000),
      factor1=rep(1:5, each=200),
      factor2=1:5,
      value=rnorm(30, 100, 10)
    )

    dt = data.table(df)
    setkeyv(dt, c("date", "factor1", "factor2"))

    # Return the values from the 7 days before `date` for one factor combination
    get_window <- function(date, factor1, factor2) {
      criteria <- data.table(
        date=as.IDate((date - 7):(date - 1), origin="1970-01-01"),
        factor1=as.integer(factor1),
        factor2=as.integer(factor2)
      )
      return(dt[criteria][, value])
    }

    # One output row per unique (date, factor1, factor2) combination
    output <- data.table(unique(dt[, list(date, factor1, factor2)]))[, window_median:=as.numeric(NA)]

    # Naive loop: compute the window median one row at a time
    for(i in nrow(output):1) {
      print(i)
      output[i, window_median:=median(get_window(date, factor1, factor2))]
    }
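For comparison, the per-row lookup in the loop above could probably be written as a single join. The following is only a sketch, and it assumes data.table 1.9.8 or later, which supports non-equi joins in `on=`. The smaller toy data set and the names `windows`, `win_start`, `win_end` and `res` are made up for illustration; the window is the same as in the loop (the 7 days before each date, within one factor1/factor2 combination).

```r
library(data.table)

set.seed(1)

# Toy data (hypothetical): 10 days, 2 x 2 factor combinations, 2 rows each per day
dt <- data.table(
  date    = as.IDate("2012-01-01") + rep(0:9, each = 8),
  factor1 = rep(rep(1:2, each = 4), times = 10),
  factor2 = rep(1:2, times = 40),
  value   = rnorm(80, 100, 10)
)

# One row per (date, factor1, factor2), plus the bounds of its 7-day window
windows <- unique(dt[, .(date, factor1, factor2)])
windows[, `:=`(win_start = date - 7L, win_end = date - 1L)]

# Non-equi join: for each row of `windows`, take the median of all rows of `dt`
# with the same factors and a date inside [win_start, win_end].
# by = .EACHI evaluates j once per row of `windows`, in the order of `windows`.
res <- dt[windows,
          on = .(factor1, factor2, date >= win_start, date <= win_end),
          .(window_median = median(value)),
          by = .EACHI]

# Windows with no matching rows (the first day here) come back as NA
windows[, window_median := res$window_median]
```

Only the `window_median` column of `res` is used, because the join bounds appear in `res` under the name of the joined column (`date`). This sketch has not been benchmarked against the loop, so treat any speed advantage as something to measure on the real data.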
r time-series data.table sliding-window
alan Jul 26 '12 at 19:15