I have a question about `future()` / `doFuture()`.
I want to run N computations in parallel (using `foreach ... %dopar%`), where N is the number of cores I have on my machine. For this, I use `future`:
```r
library(doFuture)

registerDoFuture()
plan(multiprocess)

N <- availableCores()  # number of cores on this machine

foreach(i = seq_len(N)) %dopar% {
  foo <- rnorm(1e6)
}
```
This works like a charm: the N computations run in parallel. But I also need to perform another step of the analysis, one which itself uses a large number of cores (say, N). This is what the code looks like:
```r
foreach(i = seq_len(N)) %dopar% {
  foo <- rnorm(1e6)
  write.table(foo, paste0("file_", i, ".txt"))
  # This step uses a high number of cores
  system(paste0("head ", "file_", i, ".txt", " > ", "file_head_", i, ".txt"))
}
```
I run several `rnorm` and `head` calls in parallel, but since `head` itself uses a large number of cores (let's assume it does for this example), the machine gets oversubscribed and my analysis is stuck.
Question:
How can I run only part of the code in parallel with `future`? That is, how can I run `rnorm` in parallel and then run `head` serially? Is there any solution that does not use a second loop (see the sketch of the two-loop workaround below)? Or do I need to switch to `doSNOW` or `parallel`?
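For reference, this is a minimal sketch of the two-loop workaround I already know about (file names are just the ones from the example above): first generate and write the files in parallel, then run `head` over them sequentially in a separate loop.

```r
library(doFuture)

registerDoFuture()
plan(multiprocess)

N <- availableCores()

# Loop 1: the parallel part only
foreach(i = seq_len(N)) %dopar% {
  foo <- rnorm(1e6)
  write.table(foo, paste0("file_", i, ".txt"))
}

# Loop 2: the serial part -- this second loop is what I'd like to avoid
for (i in seq_len(N)) {
  system(paste0("head ", "file_", i, ".txt", " > ", "file_head_", i, ".txt"))
}
```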
PS:
My real code looks something like this:
```r
library(doFuture)
library(dplyr)

registerDoFuture()
plan(multiprocess)

foreach(i = seq_len(N)) %dopar% {
  step1(i) %>%
    step2() %>%
    step3() %>%
    step4_RUN_SEQUENTIAL()
}
```
Reply to @Andrie's comment:
- `future()` is just my way of doing parallel computing in R. I'm new to it and find it the easiest to use (compared to, for example, `parallel::mclapply`). However, if it is possible to do what I want with `doSNOW` or `parallel`, then I am more than happy to switch.
- I know about the two-loop workaround (like the sketch above), but I'm looking for a single-loop solution.