Search for parallel trees

Question

Search for parallel trees

Say I have a lazy Tree whose leaves are possible solutions to the problem

 data Tree a = Node [Tree a] | Leaf (Maybe a)

I need to find only one solution (or find out that it is not).

I have a P-nuclear machine. Based on considerations of the efficient use of time and memory, it makes sense to search simultaneously for P different branches.

For example, suppose you have four branches of approximately the same computational complexity (corresponding to T seconds of processor time), and each of them has an answer.

If you evaluate all four branches truly in parallel on a dual-core machine, then they will all end in about 2T seconds.

If you evaluate only the first two branches and set aside the other two, you will get an answer in only T seconds, also using half as much memory.

My question is, is it possible to use any of the Haskell parallel infrastructure (Par monad, parallel Strategies, ...) for this, or do I need to use lower level tools like async?

+4

parallel-processing haskell

Roman cheplyaka Jun 26 '13 at 14:14

source share

2 answers

No less Par monads and strategies from the parallel package allow you to create only clean, unconditional parallel systems that look beautiful on such pictures:

  a
 / \
 bc
 \ / \
  de
  \ ...

In general, you really need unclean cross-thread communications:

 solve :: Tree a -> Maybe a smartPartition :: Tree a -> Int -> [[Tree a]] smartPartition tree P = ... -- split the tree in fairly even chunks, -- one per each machine core solveP :: Tree a -> IO (Maybe a) solveP tree = do resRef <- newIORef Nothing results <- parallel (map work (smartPartition tree)) return (msum results) where work [] = return Nothing work (t:ts) = do res <- readIORef resRef if (isJust res) then (return res) else do let tRes = solve t if (isNothing tRes) then (work ts) else do writeIORef tRes return tRes

However, if your calculations on one sheet are quite and equally costly, obscure strategies should not (I'm not sure) greatly affect performance:

 partitionLeafs :: Tree a -> Int -> [[Tree a]] solveP :: Tree a -> Maybe a solveP = msum . map step . transpose . partitionLeafs where step = msum . parMap rdeepseq solve

R.S. I feel that I understand the area of the problem no better than you (at least), so you probably already know all of the above. I wrote this answer for discussion because the question is very interesting to me.

+1

leventov Jun 26 '13 at 19:12

source share

Simon marlow · Accepted Answer · 2013-06-28T08:57:48+0000

Both strategies and Par monad will begin to evaluate a new parallel task if there is an available CPU, therefore in your example with four branches on a dual-core machine only two will be evaluated. In addition, the Strategies will perform other tasks after the response (although this may take some time).

However, if each of these two branches creates more tasks, then you probably would like to give priority to newer tasks (that is, in depth), but strategies will at least give priority to older tasks. The monarch Par Parat is supposed to prefer the newer ones (but I would have to check it out), however Par monad will evaluate all tasks before returning the answer, because that is how it provides determinism.

So, probably the only way to get this to work the way you want it to, at the moment, is to write your own scheduler for the monad Par.

Search for parallel trees

More articles: