Nested parallel functions in R

I am familiar with foreach, %dopar% and the like. I am also familiar with the parallel option for cv.glmnet. But how do you set up nested parallelism as shown below?

library(glmnet)
library(foreach)
library(parallel)
library(doSNOW)
Npar <- 1000
Nobs <- 200
Xdat <- matrix(rnorm(Nobs * Npar), ncol = Npar)
Xclass <- rep(1:2, each = Nobs/2)
Ydat <- rnorm(Nobs)

Parallel Cross Validation:

cl <- makeCluster(8, type = "SOCK")
registerDoSNOW(cl)
system.time(mods <- foreach(x = 1:2, .packages = "glmnet") %dopar% {
    idx <- Xclass == x
    cv.glmnet(Xdat[idx,], Ydat[idx], nfolds = 4, parallel = TRUE)
})
stopCluster(cl)

Non-parallel Cross Validation:

cl <- makeCluster(8, type = "SOCK")
registerDoSNOW(cl)
system.time(mods <- foreach(x = 1:2, .packages = "glmnet") %dopar% {
    idx <- Xclass == x
    cv.glmnet(Xdat[idx,], Ydat[idx], nfolds = 4, parallel = FALSE)
})
stopCluster(cl)

The two system times differ only very slightly.

Is the nested parallelism handled automatically, or do I need to set up the nesting explicitly?

Side question: if the cluster object has 8 cores and the foreach loop contains two tasks, will each task be given 1 core (leaving the other 6 cores idle), or will each task be given four cores (using all 8 cores in total)? How can I query how many cores are currently in use?


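In your parallel cross-validation example, cv.glmnet itself does not run in parallel, because no foreach backend is registered on the cluster workers. The outer foreach loop runs in parallel, but the foreach loop inside cv.glmnet does not.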
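One way to check whether the workers have a foreach backend is to ask each of them directly. The following is a minimal sketch, assuming the foreach package is installed on the machine. It uses a plain cluster from the parallel package rather than doSNOW, and the variable name `registered` is only illustrative.

```r
library(parallel)

# Start two fresh workers (a default PSOCK cluster from the parallel package)
cl <- makeCluster(2)

# Ask every worker whether a foreach parallel backend is registered there
registered <- unlist(clusterCall(cl, function() foreach::getDoParRegistered()))

stopCluster(cl)

print(registered)
```

On freshly started workers nothing has been registered, so each element is expected to be FALSE, which is why a `%dopar%` loop inside a worker falls back to sequential execution.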

To use doSNOW for both the outer and the inner foreach loops, you can initialize the cluster workers with clusterCall:

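# Outer cluster: two workers for the outer foreach loop
cl <- makeCluster(2, type = "SOCK")
# On each outer worker, start and register an inner two-worker cluster
clusterCall(cl, function() {
  library(doSNOW)
  registerDoSNOW(makeCluster(2, type = "SOCK"))
  NULL
})
# Register the outer cluster on the master
registerDoSNOW(cl)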

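This registers doSNOW on both the master and the workers, so each call to cv.glmnet runs on its own two-worker cluster when parallel=TRUE is specified.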
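The number of workers a registered backend will use can be queried with `getDoParWorkers()` from foreach, which also speaks to the side question. The sketch below is a variation on the setup above, not the original code. It assumes doSNOW is installed, and it stores the inner cluster under the hypothetical name `inner_cl` on each worker so that it can be shut down cleanly afterwards.

```r
library(doSNOW)

# Outer cluster with two workers
cl <- makeCluster(2, type = "SOCK")

# On each outer worker: start an inner two-worker cluster, keep a handle
# to it in the worker's global environment, and register it for foreach
clusterCall(cl, function() {
  library(doSNOW)
  assign("inner_cl", makeCluster(2, type = "SOCK"), envir = .GlobalEnv)
  registerDoSNOW(get("inner_cl", envir = .GlobalEnv))
  NULL
})
registerDoSNOW(cl)

# Workers available to the outer loop (queried on the master)
outer_workers <- getDoParWorkers()
# Workers available to the inner loop (queried on each outer worker)
inner_workers <- unlist(clusterCall(cl, function() foreach::getDoParWorkers()))

# Shut down the inner clusters first, then the outer one
clusterCall(cl, function() {
  stopCluster(get("inner_cl", envir = .GlobalEnv))
  NULL
})
stopCluster(cl)
```

Note that `getDoParWorkers()` reports how many workers are registered, not how many cores are busy at a given moment.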

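The trick with nested parallelism is to avoid creating too many processes and oversubscribing the CPU (or CPUs), so you need to be careful when registering the parallel backends. My example makes sense for a CPU with four cores, even though six processes are created in total, because the "master" process uses little CPU time while the inner foreach loops are executing. When running on a cluster, it is common to use doSNOW to start one worker per node, and then use doMC to make use of the cores on that node.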
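One simple way to avoid oversubscription is to split the available cores between the two levels before creating any cluster. This is only a sketch of one possible budgeting rule, not something the backends do on their own. The names `n_tasks`, `n_outer` and `n_inner` are illustrative.

```r
library(parallel)

n_tasks <- 2                 # iterations of the outer foreach loop, as in the question
n_cores <- detectCores()     # can return NA on some platforms
if (is.na(n_cores)) n_cores <- 1

# One outer worker per task, but never more than the available cores
n_outer <- min(n_tasks, n_cores)
# Divide the cores evenly among the outer workers for the inner loops
n_inner <- max(1, n_cores %/% n_outer)

cat("outer workers:", n_outer, " inner workers per outer worker:", n_inner, "\n")
```

With 8 cores and two tasks this rule gives two outer workers with four inner workers each. These values would then be passed to the outer and inner `makeCluster` calls.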

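Note that your example does not need much compute time, so two levels of parallelism are not really worthwhile here. I would use a much larger problem to assess the benefits of the different approaches.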


Source: https://habr.com/ru/post/1523313/

