Creating a function for .combine in foreach

I have a process that I want to run in parallel, but it sometimes fails with a strange error. I am now considering having the combine step recompute any failed task on the master process. However, I do not know how to write such a function for .combine.

How should this be written?

I know how to write .combine functions in general (this answer gives an example), but that example does not show how to handle failed tasks or how to rerun a task on the master.

I would do something like:

    foreach(i=1:100, .combine = function(x, y) { tryCatch(?) }) %dopar% { long_process_which_fails_randomly(i) }

However, how can I access the task's input from within the .combine function (if that is even possible)? Or should the body of %dopar% return a flag or a list so that the failed task can be recomputed?

1 answer

To rerun tasks in the combine function, you need to include extra information in the result object returned by the body of the foreach loop. In this case, that is an error flag and the value of i. There are many ways to do this, but here is an example:

    comb <- function(results, x) {
      i <- x$i
      result <- x$result
      if (x$error) {
        cat(sprintf('master computing failed task %d\n', i))
        # Could call function repeatedly until it succeeds,
        # but that could hang the master
        result <- try(fails_randomly(i))
      }
      results[i] <- list(result)  # guard against a NULL result
      results
    }

    r <- foreach(i=1:100, .combine='comb',
                 .init=vector('list', 100)) %dopar% {
      tryCatch({
        list(error=FALSE, i=i, result=fails_randomly(i))
      },
      error=function(e) {
        list(error=TRUE, i=i, result=e)
      })
    }
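
The comment in the combine function mentions that retrying until success could hang the master. One way around that is to cap the number of retries. The following is only a sketch under stated assumptions: `fails_randomly` is a hypothetical stand-in that fails about 30% of the time, `max_master_retries` is an assumed name for the cap, and a sequential backend is registered so that the code runs without a cluster.

```r
library(foreach)
registerDoSEQ()  # sequential backend for illustration; swap in a parallel one

# Hypothetical stand-in for the real task: fails about 30% of the time
fails_randomly <- function(i) {
  if (runif(1) < 0.3) stop("random failure")
  sqrt(i)
}

max_master_retries <- 3  # assumed cap so that the master cannot hang

comb <- function(results, x) {
  result <- x$result
  if (x$error) {
    # Retry on the master, but give up after max_master_retries attempts
    for (attempt in seq_len(max_master_retries)) {
      result <- try(fails_randomly(x$i), silent = TRUE)
      if (!inherits(result, 'try-error')) break
    }
  }
  results[x$i] <- list(result)  # list() guards against a NULL result
  results
}

set.seed(42)
r <- foreach(i = 1:100, .combine = comb,
             .init = vector('list', 100)) %dopar% {
  tryCatch(
    list(error = FALSE, i = i, result = fails_randomly(i)),
    error = function(e) list(error = TRUE, i = i, result = e)
  )
}
```

Any task that still fails after the capped retries is stored as a `try-error` object, so the caller can find it afterwards with `inherits(r[[i]], 'try-error')`.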

I would be tempted to solve this problem by repeating the parallel loop until all tasks have been computed:

    x <- rnorm(100)
    results <- lapply(x, function(i) simpleError(''))

    # Might want to put a limit on the number of retries
    repeat {
      ix <- which(sapply(results, function(x) inherits(x, 'error')))
      if (length(ix) == 0)
        break

      cat(sprintf('computing tasks %s\n', paste(ix, collapse=',')))
      r <- foreach(i=x[ix], .errorhandling='pass') %dopar% {
        fails_randomly(i)
      }

      results[ix] <- r
    }
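
The retry limit suggested in the comment could be added by replacing `repeat` with a bounded `for` loop. This is a minimal sketch, not a definitive implementation: `max_retries` is an assumed name and value, `fails_randomly` is a hypothetical stand-in for the real task, and the sequential backend is used only so that the example runs anywhere.

```r
library(foreach)
registerDoSEQ()  # sequential backend for illustration; swap in a parallel one

# Hypothetical stand-in for the real task: fails about 30% of the time
fails_randomly <- function(i) {
  if (runif(1) < 0.3) stop("random failure")
  i^2
}

set.seed(1)
x <- rnorm(100)
# Start with every slot marked as failed so the first pass computes all tasks
results <- lapply(x, function(i) simpleError(''))
max_retries <- 10  # assumed cap; tune for the workload

for (attempt in seq_len(max_retries)) {
  ix <- which(sapply(results, function(r) inherits(r, 'error')))
  if (length(ix) == 0) break

  r <- foreach(i = x[ix], .errorhandling = 'pass') %dopar% {
    fails_randomly(i)
  }
  results[ix] <- r
}

# Report anything that never succeeded instead of looping forever
failed <- which(sapply(results, function(r) inherits(r, 'error')))
if (length(failed) > 0) {
  warning(sprintf('%d tasks still failing after %d attempts',
                  length(failed), max_retries))
}
```

Tasks that never succeed remain in `results` as error objects, so the indices in `failed` can be inspected or recomputed by other means.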

Note that this solution uses the .errorhandling option, which is very useful if errors can occur. With 'pass', the error object is returned as the task's result instead of aborting the whole loop. For more information about this option, see the foreach man page.
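
To make the effect of .errorhandling concrete, here is a small sketch comparing the 'pass' and 'remove' settings. The task `fails_on_even` is a hypothetical function that fails deterministically for even inputs, and the sequential backend is used only so that the example runs without a cluster.

```r
library(foreach)
registerDoSEQ()  # sequential backend for illustration

# Hypothetical task: fails deterministically for even inputs
fails_on_even <- function(i) {
  if (i %% 2 == 0) stop("even input")
  i
}

# 'pass' keeps one slot per task; failed slots hold the error object
passed <- foreach(i = 1:4, .errorhandling = 'pass') %dopar% {
  fails_on_even(i)
}

# 'remove' silently drops the failed tasks from the result
removed <- foreach(i = 1:4, .errorhandling = 'remove') %dopar% {
  fails_on_even(i)
}

# Indices of the failed tasks, recoverable only with 'pass'
which(sapply(passed, function(r) inherits(r, 'error')))
```

With 'pass' the positions of the failures are preserved, which is what the retry loop relies on. With 'remove' the result is shorter and the failed indices are lost.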


Source: https://habr.com/ru/post/1260289/

