future: run only part of the code in parallel

I have a question about the future / doFuture packages.

I want to run N computations in parallel (using foreach ... %dopar%), where N is the number of cores on my machine. For this I use future:

    library(doFuture)
    registerDoFuture()
    plan(multiprocess)

    N <- parallel::detectCores()  # N = number of cores on my machine

    foreach(i = seq_len(N)) %dopar% {
      foo <- rnorm(1e6)
    }

This works like a charm: the N computations run in parallel. But I also need to perform another step of the analysis, and that step itself uses a high number of cores (say, N). This is what the code looks like:

    foreach(i = seq_len(N)) %dopar% {
      foo <- rnorm(1e6)
      write.table(foo, paste0("file_", i, ".txt"))
      # This step uses a high number of cores
      system(paste0("head ", "file_", i, ".txt", " > ", "file_head_", i, ".txt"))
    }

So I run several rnorm calls and head commands in parallel, but since each head itself uses a high number of cores (let's assume it does), my analysis gets stuck.

Question:

How can I run only part of the code in parallel with future? (That is, run only rnorm in parallel, and then head serially.) Is there a solution that does not use another loop for this? Or do I need to switch to doSNOW or parallel?
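For reference, the two-loop workaround I am trying to avoid looks roughly like this (a sketch only: the multisession plan, the worker count, and the small vector size are illustrative, not my real settings):

```r
library(doFuture)
registerDoFuture()
plan(multisession, workers = 2)  # illustrative plan and worker count

N <- 2  # small N for the sketch; in practice parallel::detectCores()

# Loop 1: the rnorm part runs in parallel across workers
foreach(i = seq_len(N)) %dopar% {
  foo <- rnorm(1e4)
  write.table(foo, paste0("file_", i, ".txt"))
}

# Loop 2: the core-hungry head step runs one at a time in the main session
for (i in seq_len(N)) {
  system(paste0("head file_", i, ".txt > file_head_", i, ".txt"))
}
```

This avoids the oversubscription, but it needs a second loop, which is exactly what I would like to get rid of.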

PS:

My real code looks something like this:

    library(doFuture)
    library(dplyr)
    registerDoFuture()
    plan(multiprocess)

    foreach(i = seq_len(N)) %dopar% {
      step1(i) %>%
        step2() %>%
        step3() %>%
        step4_RUN_SEQUENTIAL() %>%  # I want to run this part NOT in parallel
        step5()                     # I want to run this part in parallel again
    }
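Applied to this pipeline, the extra-loop workaround splits the chain around the sequential step. In the sketch below the stepN() functions are placeholder stand-ins for my real ones, and the plan/worker settings are illustrative:

```r
library(doFuture)
library(dplyr)
registerDoFuture()
plan(multisession, workers = 2)  # illustrative plan and worker count

# Placeholder steps standing in for the real pipeline
step1 <- function(i) rnorm(100) + i
step2 <- function(x) abs(x)
step3 <- function(x) sqrt(x)
step4_RUN_SEQUENTIAL <- function(x) sum(x)  # pretend this must not run concurrently
step5 <- function(x) x * 2

N <- 2

# Parallel part: steps 1-3, collecting the intermediate results
mid <- foreach(i = seq_len(N)) %dopar% {
  step1(i) %>% step2() %>% step3()
}

# Sequential part: step 4 runs one at a time in the main session
seq_out <- lapply(mid, step4_RUN_SEQUENTIAL)

# Parallel again: step 5
final <- foreach(x = seq_out) %dopar% step5(x)
```

Again, this works but requires breaking the single %dopar% loop into three stages, which is what I am hoping to avoid.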

Reply to @Andrie's comment:

  • future is my way of doing parallel computing in R. I'm new to this and find it the easiest to use (compared to, for example, parallel::mclapply). However, if what I want is possible in doSNOW or parallel, then I am more than happy to switch.
  • I know about that, but I'm looking for a single-loop solution.

Source: https://habr.com/ru/post/1272912/
