Repa nested array definitions resulting in "Performing nested parallel computation sequentially..."

As part of a larger problem, I am trying to define an array within another array's definition, as follows:

    import Data.Array.Repa

    type Arr = Array DIM2 Int

    arr = force $ fromList (Z :. 5 :. 5) [1..25] :: Arr

    combined :: Arr
    combined =
        arr `deepSeqArray`
        traverse arr (\_ -> Z :. 4 :. 4 :: DIM2) (\f (Z :. x :. y) ->
            let reg = force $ extract f (x,y) (2,2)
            in  reg `deepSeqArray` sumAll reg)

    extract :: (DIM2 -> Int) -> (Int,Int) -> (Int,Int) -> Arr
    extract lookup (x0,y0) (width,height) =
        fromFunction bounds
        $ \sh -> offset lookup sh
        where
        bounds = Z :. width :. height
        offset :: (DIM2 -> Int) -> DIM2 -> Int
        offset f (Z :. x :. y) = f (Z :. x + x0 :. y + y0)

    main = print combined

The extract function uses fromFunction and the lookup function passed to it, but it could also use traverse and arr ! ... to the same effect. Despite using force and deepSeqArray everywhere, as early as possible, the console fills up with the message below, followed by the correct result:

Data.Array.Repa: Performing nested parallel computation sequentially. You've probably called the 'force' function while another instance was already running. This can happen if the second version was suspended due to lazy evaluation. Use 'deepSeqArray' to ensure that each array is fully evaluated before you 'force' the next one.

I have not yet built a list-based version to compare speeds, but in the larger program performance is suffering.

Is this simply a consequence of nested array definitions, meaning I should restructure my program so that either the inner or the outer definition is a list? Or is my extract function badly written and the cause of the problem?

The tips from this question were useful for getting this far, but I have not yet gone digging through the compiled code.

1 answer

This is because 'print' implicitly forces the array as well. The inner 'force' and 'sumAll' functions invoke parallel computations, but so does 'print', so you end up with nested parallelism. The fact that this is so non-obvious is a great sadness in the Repa 2 API.

Repa 3 addresses these problems by exporting both sequential and parallel versions of 'force', 'sumAll', and so on. It also adds a tag to the array type to indicate whether the array is delayed or manifest. Repa 3 is not finished yet, but you can use the head version at http://code.ouroborus.net/repa . It should be released shortly after GHC 7.4, later this year.
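To make the sequential/parallel split and the array tags concrete, here is a minimal sketch. It assumes a later released Repa 3 in which the parallel functions (computeP, sumAllP) run in a monad while the sequential ones (sumAllS) are pure; the head version mentioned above may differ. The names manifest and delayed are made up for illustration.

```haskell
import Data.Array.Repa as R

-- 'U' tags a manifest array backed by an unboxed vector.
manifest :: Array U DIM2 Int
manifest = fromListUnboxed (Z :. 2 :. 2) [1, 2, 3, 4]

-- 'D' tags a delayed array: nothing is computed until it is consumed.
delayed :: Array D DIM2 Int
delayed = R.map (* 2) manifest

main :: IO ()
main = do
  -- Sequential reduction: pure, and safe to call inside another computation.
  let seqTotal = sumAllS delayed
  -- Parallel reduction and parallel evaluation: monadic in the assumed release.
  parTotal <- sumAllP delayed
  doubled  <- computeP delayed :: IO (Array U DIM2 Int)
  print (seqTotal, parTotal, toList doubled)
```

The idea is to use the P variants only at the outermost level and the S variants for anything evaluated per element inside them, so that no parallel computation is started from within another one.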

Here is a Repa 3 version of your example that runs without the warning about nested parallelism. Note that 'force' is now 'computeP'.

    import Data.Array.Repa

    arr :: Array U DIM2 Int
    arr = fromListUnboxed (Z :. 5 :. 5) [1..25]

    combined :: Array U DIM2 Int
    combined
      = computeP $ traverse arr (\_ -> Z :. 4 :. 4 :: DIM2)
      $ \f (Z :. x :. y) -> sumAllS $ extract f (x,y) (2,2)

    extract :: (DIM2 -> Int) -> (Int,Int) -> (Int,Int) -> Array D DIM2 Int
    extract lookup (x0,y0) (width,height)
      = fromFunction bounds
      $ \sh -> offset lookup sh
      where
        bounds = Z :. width :. height
        offset :: (DIM2 -> Int) -> DIM2 -> Int
        offset f (Z :. x :. y) = f (Z :. x + x0 :. y + y0)

    main = print combined
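If you want to cross-check the result without Repa, the same 2x2 window sums can be sketched with plain lists. This is only a reference implementation for comparison, not a suggestion about performance. It assumes the array is filled in row-major order, and the helper names (grid, chunksOf, windows, windowSums) are made up for this sketch.

```haskell
import Data.List (tails, transpose)

-- 5x5 grid holding 1..25 in row-major order, mirroring 'arr' above.
grid :: [[Int]]
grid = chunksOf 5 [1 .. 25]

-- Split a list into consecutive chunks of length n (n > 0).
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- All contiguous windows of exactly n elements.
windows :: Int -> [a] -> [[a]]
windows n xs = [w | t <- tails xs, let w = take n t, length w == n]

-- Sum of every h-by-w window, indexed by the window's top-left corner.
windowSums :: Int -> Int -> [[Int]] -> [[Int]]
windowSums h w rows =
  [ [ sum (concat block) | block <- transpose (map (windows w) band) ]
  | band <- windows h rows ]

main :: IO ()
main = mapM_ print (windowSums 2 2 grid)
-- prints:
-- [16,20,24,28]
-- [36,40,44,48]
-- [56,60,64,68]
-- [76,80,84,88]
```

The element at row x, column y is 5*x + y + 1, so each 2x2 window sums to 20*x + 4*y + 16, which matches the values listed in the comment.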

Source: https://habr.com/ru/post/1382305/

