Suppose I want to run an R program on multiple cores as follows:
library(foreach)
library(doParallel)
no_cores <- detectCores() - 2
cl <- makeCluster(no_cores, outfile = "debug.txt")
registerDoParallel(cl)
result <- foreach(i = 10:100,
                  .combine = list,
                  .multicombine = TRUE) %dopar% {
  set.seed(i)
  a <- replicate(i, rnorm(20))
  b <- replicate(i, rnorm(20))
  list(x = a + b, y = a - b)
}
However, I found that memory usage keeps increasing once the program has been running for a while. I think the program does not release the old objects, so I tried calling gc() at the end of each iteration:
result <- foreach(i = 10:100,
                  .combine = list,
                  .multicombine = TRUE) %dopar% {
  set.seed(i)
  a <- replicate(i, rnorm(20))
  b <- replicate(i, rnorm(20))
  list(x = a + b, y = a - b)
  gc()
}
This seems to keep memory usage down, but I no longer get the result I want. I then tried collecting garbage at the start of each iteration instead, but that doesn't seem to work:
result <- foreach(i = 10:100,
                  .combine = list,
                  .multicombine = TRUE) %dopar% {
  gc()
  set.seed(i)
  a <- replicate(i, rnorm(20))
  b <- replicate(i, rnorm(20))
  list(x = a + b, y = a - b)
}
Is there any way to solve this problem? Thanks, any suggestions would be appreciated. PS: this code is only a minimal example to reproduce the issue; my real simulation is much more complicated, so I do not want to change the structure of the program too much.
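
A note on the second attempt: the value of a foreach iteration is the value of the last expression in the body, so when gc() comes last, each element of result is gc()'s memory-usage summary rather than the list. A minimal sketch of a variant that keeps the gc() call but still returns the list is below. The name `out`, the two-worker cluster, and the shortened range 10:12 are assumptions made only to keep the example small, and I cannot say whether this actually cures the memory growth in a larger simulation.

```r
library(foreach)
library(doParallel)

# Assumption: two workers, to keep the sketch small
# (the code above uses detectCores() - 2).
cl <- makeCluster(2)
registerDoParallel(cl)

result <- foreach(i = 10:12,              # shortened range, for illustration only
                  .combine = list,
                  .multicombine = TRUE) %dopar% {
  set.seed(i)
  a <- replicate(i, rnorm(20))
  b <- replicate(i, rnorm(20))
  out <- list(x = a + b, y = a - b)       # hypothetical name: the value to return
  rm(a, b)                                # drop the intermediates explicitly
  gc()                                    # collect on the worker; its return value is ignored
  out                                     # last expression is the iteration's value
}

stopCluster(cl)
```

With this ordering, result[[1]]$x should again be a 20 x 10 matrix, as in the version without gc().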