Is there any way to free memory when doing parallel computing in R?

Suppose I want to run an R program using multiple cores, as follows:

library(foreach)
library(doParallel)

no_cores <- detectCores() - 2

cl <- makeCluster(no_cores, outfile = "debug.txt")

registerDoParallel(cl)

result <- foreach(i = 10:100, 
        .combine = list,
        .multicombine = TRUE)  %dopar%  {

          set.seed(i)

          a <- replicate(i, rnorm(20)) 
          b <- replicate(i, rnorm(20))

          list(x = a + b, y = a - b)

        } 

However, I found that memory usage kept increasing after the program had been running for a while. I think the program does not release the old objects, so I tried adding gc() as follows:

result <- foreach(i = 10:100, 
        .combine = list,
        .multicombine = TRUE)  %dopar%  {

          set.seed(i)

          a <- replicate(i, rnorm(20)) 
          b <- replicate(i, rnorm(20))

          list(x = a + b, y = a - b)
          gc()

        } 

This seems to keep memory usage down, but I no longer get the result I want: gc() is now the last expression of the block, so its output, rather than my list, becomes the value returned by each iteration. I then tried collecting garbage before each cycle instead, but that does not seem to work either.

result <- foreach(i = 10:100, 
        .combine = list,
        .multicombine = TRUE)  %dopar%  {
          gc()
          set.seed(i)

          a <- replicate(i, rnorm(20)) 
          b <- replicate(i, rnorm(20))

          list(x = a + b, y = a - b)    
        } 

Is there any way to solve this problem? Thanks, any suggestion would be appreciated. PS: this code is just a minimal reproducible example; my real simulation is much more complicated, so I do not want to change the structure of the program too much.
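
One workaround that comes to mind (just a minimal sketch, assuming the only problem with the second attempt is that gc() becomes the return value of the block) is to store the list in a variable, collect garbage, and then return the stored object as the last expression:

result <- foreach(i = 10:100,
        .combine = list,
        .multicombine = TRUE) %dopar% {

          set.seed(i)

          a <- replicate(i, rnorm(20))
          b <- replicate(i, rnorm(20))

          # store the result before collecting garbage
          out <- list(x = a + b, y = a - b)

          # drop the intermediates and collect on this worker
          rm(a, b)
          gc()

          # return the stored list as the last expression,
          # so gc()'s output does not replace it
          out
        }

stopCluster(cl)

The idea is simply that foreach collects whatever the last expression of the block evaluates to, so returning out keeps the result while gc() still runs on each worker, and stopCluster(cl) at the end releases the workers themselves. Is this the right way to do it, or is there a cleaner approach?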
