I want to speed up the following algorithm. I feed a function an xts time series, and for each time point I want to run a principal components analysis on the previous X points (I am using 500 right now), then use the results of that PCA (5 principal components in the code below) to calculate some value. Something like this:
    lookback <- 500
    for (i in (lookback + 1):nrow(x)) {
      x.now    <- x[(i - lookback):i]
      x.prcomp <- prcomp(x.now)
      ans[i]   <- (some R code on x.prcomp)
    }
I suppose vectorizing this would require me to replicate the lookback rows as columns, so that x becomes something like cbind(x, lag(x), lag(x, k=2), lag(x, k=3), ..., lag(x, k=lookback)), and then run prcomp on each row? That seems expensive, though. Perhaps some variant of apply? I'm willing to take a look at Rcpp, but I wanted to run this by you guys first.
Edit: Thanks for the answers. Information about my dataset / algorithm:
- dim(x.xts) is currently 2000x24. Eventually, if this shows promise, it will have to run fast (I will be feeding it several datasets).
- func(x.xts) takes ~70 seconds. That is 2000-500 = 1500 prcomp calls, along with creating 1500 frames of 500x24.
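To make the timing reproducible, here is a minimal sketch of the loop on random data of the same shape. It is an assumption-laden stand-in, not my real function: the data is a plain random matrix rather than x.xts, and `pc_summary` (share of variance explained by the first 5 components) is a hypothetical placeholder for the "(some R code on x.prcomp)" step.

```r
# Synthetic stand-in for x.xts: 2000 observations of 24 series (random data,
# not the real dataset). A plain matrix is used so no packages are needed.
set.seed(1)
x <- matrix(rnorm(2000 * 24), nrow = 2000, ncol = 24)

lookback <- 500
n_comp   <- 5   # number of principal components used

# Hypothetical placeholder for "(some R code on x.prcomp)":
# share of total variance explained by the first k components.
pc_summary <- function(p, k) sum(p$sdev[seq_len(k)]^2) / sum(p$sdev^2)

rolling_pca <- function(x, lookback, k) {
  ans <- rep(NA_real_, nrow(x))                      # preallocate; first rows stay NA
  for (i in (lookback + 1):nrow(x)) {
    x.now  <- x[(i - lookback):i, , drop = FALSE]    # lookback + 1 rows, as in the loop above
    ans[i] <- pc_summary(prcomp(x.now), k)
  }
  ans
}

timing <- system.time(ans <- rolling_pca(x, lookback, n_comp))
print(timing)
```

If this runs much faster than the real function, that would suggest the time goes into xts subsetting or the post-PCA step rather than into prcomp itself.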
I tried using Rprof to find out which part is the most expensive, but this is my first time using Rprof, so I need more experience with the tool to get intelligible results (thanks for the suggestion).
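For reference, this is roughly how I understand the Rprof workflow: start the profiler, run the code, stop it, then summarise. The sketch below uses a small random matrix as a stand-in for my data, and repeats the loop a few times only so that the sampler collects enough samples.

```r
# Minimal Rprof sketch: profile a rolling prcomp loop and summarise by function.
set.seed(1)
x <- matrix(rnorm(600 * 24), nrow = 600)    # small random stand-in for the data
lookback <- 500

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)            # start sampling every 10 ms
for (r in 1:20) {                            # repeated so enough samples are recorded
  for (i in (lookback + 1):nrow(x)) {
    p <- prcomp(x[(i - lookback):i, , drop = FALSE])
  }
}
Rprof(NULL)                                  # stop profiling

prof <- summaryRprof(prof_file)
print(head(prof$by.self))    # time spent inside each function itself
print(head(prof$by.total))   # time including functions called from it
```

The `by.self` table is the one to read first: it shows which functions consume time themselves rather than merely calling something slow.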
I think I will first try to turn this into an _apply-type loop, and then look at parallelization.
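A minimal sketch of what the _apply version might look like, again on random data and with a hypothetical `pc_summary` standing in for my real post-PCA calculation. I would not expect vapply alone to be much faster than the for loop, since each iteration is dominated by prcomp; the point is that a per-window function makes the later parallel step a one-line change.

```r
set.seed(1)
x <- matrix(rnorm(700 * 24), nrow = 700)   # random stand-in for the data
lookback <- 500
n_comp   <- 5

# Hypothetical placeholder for the real calculation on the PCA result.
pc_summary <- function(p, k) sum(p$sdev[seq_len(k)]^2) / sum(p$sdev^2)

# All the work for one time point, so it can be handed to any *apply function.
one_window <- function(i) {
  pc_summary(prcomp(x[(i - lookback):i, , drop = FALSE]), n_comp)
}

idx <- (lookback + 1):nrow(x)
ans <- rep(NA_real_, nrow(x))               # first `lookback` entries stay NA
ans[idx] <- vapply(idx, one_window, numeric(1))

# Possible parallel variant on Unix-alikes (not run here):
# ans[idx] <- unlist(parallel::mclapply(idx, one_window, mc.cores = 2))
```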