The most obvious problem is that you have fallen into one of the classic traps: not preallocating the output vector result. Appending one value at a time can be very inefficient for large vectors. In your case, result does not need to be a vector at all: you can accumulate the results in a single value:
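    result <- 0
    for (g in 1:nrow(set)) {
      # value_g is a placeholder for the value you currently append for row g
      result <- result + value_g
    }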
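To make the difference concrete, here is a minimal sketch of the three patterns side by side. The function term() and the size n are hypothetical stand-ins, not part of your code; the point is only that all three give the same total, while the third never stores a vector at all.

```r
# Hypothetical stand-in for the value computed on each iteration.
term <- function(g) 1 / g^2
n <- 10000

# 1. Growing a vector one element at a time (the pattern to avoid):
#    every c() call copies the whole vector built so far.
grown <- c()
for (g in seq_len(n)) grown <- c(grown, term(g))

# 2. Preallocating the full vector, then filling it in place.
prealloc <- numeric(n)
for (g in seq_len(n)) prealloc[g] <- term(g)

# 3. Accumulating into one scalar when only the sum is needed.
result <- 0
for (g in seq_len(n)) result <- result + term(g)
```

If you want to see the cost on your own machine, wrapping each loop in system.time() with a larger n should show the first variant falling behind.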
But I think the most important performance improvement you could make is to precompute the expressions that are currently being recomputed on every iteration of the foreach loop. You can do this with a separate foreach loop. I also suggest calling solve differently to avoid a second matrix multiplication:
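    X_gamma_list <- foreach(g=1:nrow(set)) %dopar% {
      # columns of X that are included in model g
      X_gamma <- X[, which(set[g,] != 0)]
      # solve(A, B) returns A^-1 B directly, so no separate multiplication is needed
      I - (c/(1+c)) * (X_gamma %*% solve(crossprod(X_gamma), t(X_gamma)))
    }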
These computations are now performed only once, rather than once for each column of Y, which is 700 times less work in your case.
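As a quick sanity check on the solve suggestion, this sketch compares the two-argument form with the explicit inverse followed by a multiplication. The matrix here is made-up toy data, not your X.

```r
set.seed(1)
X_gamma <- matrix(rnorm(20 * 3), nrow = 20, ncol = 3)  # toy 20 x 3 design matrix

# Explicit inverse followed by a second matrix multiplication.
two_step <- solve(crossprod(X_gamma)) %*% t(X_gamma)

# Solving (X'X) B = X' directly gives the same 3 x 20 matrix in one call.
one_step <- solve(crossprod(X_gamma), t(X_gamma))

# Projection matrix built from the one-step result.
H <- X_gamma %*% one_step
```

Both routes agree up to floating-point error; the one-step form simply skips forming the inverse as a separate object.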
In the same spirit, it makes sense to factor out the expression ((1+c)^(-sum(set[g,])/2)), as suggested by tim riffe, as well as -T / 2 while we are at it:
    a <- (1+c) ^ (-rowSums(set) / 2)
    nT2 <- -T / 2
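A small sketch of why this is safe: the vectorized rowSums form produces exactly the values the per-row expression would. The matrix set and the values of c and T below are invented for illustration.

```r
# Toy inclusion matrix: one row per model, one column per regressor.
set <- rbind(c(1, 0, 0),
             c(1, 1, 0),
             c(1, 1, 1))
c <- 4    # hypothetical value of the constant c
T <- 50   # hypothetical sample size (this masks R's shorthand for TRUE)

# Original form: recomputed inside the loop for every g.
a_loop <- sapply(seq_len(nrow(set)),
                 function(g) (1 + c)^(-sum(set[g, ]) / 2))

# Factored-out form: one vectorized call before the loop.
a <- (1 + c)^(-rowSums(set) / 2)
nT2 <- -T / 2
```

Inside the loop you then index a[g] instead of recomputing the power each time.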
To iterate over the columns of the zoo object Y, I suggest using the isplitCols function from the itertools package. Make sure you load itertools at the top of the script:
library(itertools)
isplitCols lets you send each task only the columns it needs, rather than sending the entire object to every worker. The only trick is that you need to remove the dim attribute from the resulting zoo
objects for your code to work, since isplitCols subsets with drop=FALSE and therefore hands each task a one-column matrix rather than a plain vector.
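The dim issue can be seen without any packages. This base R sketch uses a plain matrix in place of your zoo object and assumes the chunks arrive as one-column matrices, which is my understanding of how the column iterator subsets.

```r
Y <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("y1", "y2")))

# A one-column chunk that keeps its matrix shape
# (assumption: the iterator subsets with drop = FALSE).
Yi <- Y[, 1, drop = FALSE]
dim_before <- dim(Yi)   # 3 1

# Removing dim turns the chunk back into a plain vector.
dim(Yi) <- NULL
dim_after <- dim(Yi)    # NULL
```

Code written for a single series generally expects the plain vector form, which is why the first statement in each task resets dim.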
Finally, here is the main foreach loop:
    denom <- foreach(Yi=isplitCols(Y, chunkSize=1), .packages='zoo') %dopar% {
      dim(Yi) <- NULL
      # inner loop over g goes here, accumulating into a single result
    }
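To show how the precomputed pieces might fit together, here is a sequential base R sketch that runs without foreach or zoo. Everything in it is an assumption on my part: the toy data, the helper name denom_for_column, and in particular the quadratic form raised to nT2, which I inferred from the expressions discussed above rather than from your actual loop body.

```r
set.seed(42)
T <- 30                                   # hypothetical number of observations
X <- matrix(rnorm(T * 3), nrow = T)       # toy regressors
Y <- matrix(rnorm(T * 4), nrow = T)       # toy responses, one series per column
set <- rbind(c(1, 0, 0), c(1, 1, 0), c(1, 1, 1))  # toy inclusion matrix
c <- 4                                    # hypothetical constant
I <- diag(T)

# Precompute everything that does not depend on the column of Y.
X_gamma_list <- lapply(seq_len(nrow(set)), function(g) {
  X_gamma <- X[, which(set[g, ] != 0), drop = FALSE]
  I - (c / (1 + c)) * (X_gamma %*% solve(crossprod(X_gamma), t(X_gamma)))
})
a <- (1 + c)^(-rowSums(set) / 2)
nT2 <- -T / 2

# Assumed per-column task (hypothetical helper): the kind of body the
# parallel loop would run for each column Yi.
denom_for_column <- function(Yi) {
  result <- 0
  for (g in seq_along(X_gamma_list)) {
    q <- drop(crossprod(Yi, X_gamma_list[[g]] %*% Yi))  # Yi' M_g Yi
    result <- result + a[g] * q^nT2
  }
  result
}

# Sequential stand-in for the parallel loop over the columns of Y.
denom <- vapply(seq_len(ncol(Y)),
                function(j) denom_for_column(Y[, j]), numeric(1))
```

In the parallel version, the body of denom_for_column would sit inside the foreach braces after dim(Yi) <- NULL, and the result would come back as a list with one element per column.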
Note that I would not execute the inner loop in parallel. That would only make sense if there weren't enough columns in Y to keep all of your processors busy. Parallelizing the inner loop could produce tasks that are too short, letting per-task overhead swamp the actual computation and making your code much slower. It is much more important to execute the inner loop efficiently, since g ranges over a large number of values.