How to program a stencil with Dask

Scientists often model the dynamics of a system with a stencil: an operator that is applied, like a convolution, at every point of a grid. This operation usually consumes a lot of computing resources. Here is a good explanation of the idea.

In numpy, the canonical way of programming a two-dimensional 5-point stencil is as follows:

for i in range(1, rows - 1):        # skip the boundary rows
    for j in range(1, cols - 1):    # skip the boundary columns
        grid[i, j] = (grid[i, j] + grid[i-1, j] + grid[i+1, j] + grid[i, j-1] + grid[i, j+1]) / 5

Or, more efficiently, using slicing:

grid[1:-1,1:-1] = ( grid[1:-1,1:-1] + grid[0:-2,1:-1] + grid[2:,1:-1] + grid[1:-1,0:-2] + grid[1:-1,2:] ) / 5
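
One detail worth checking is whether the loop and the sliced form agree. The sliced form evaluates the whole right-hand side before assigning, so every point sees the old values of its neighbours. A loop matches it only if it also reads from an unmodified copy. The sketch below illustrates this; the helper names `stencil_loop` and `stencil_sliced` are invented for the example and are not part of the question.

```python
import numpy as np

def stencil_loop(grid):
    """5-point average of the interior, reading only from the old values."""
    old = grid.copy()
    new = grid.copy()
    rows, cols = grid.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i, j] = (old[i, j] + old[i - 1, j] + old[i + 1, j]
                         + old[i, j - 1] + old[i, j + 1]) / 5
    return new

def stencil_sliced(grid):
    """Same computation with slicing; the boundary is left unchanged."""
    new = grid.copy()
    new[1:-1, 1:-1] = (grid[1:-1, 1:-1] + grid[:-2, 1:-1] + grid[2:, 1:-1]
                       + grid[1:-1, :-2] + grid[1:-1, 2:]) / 5
    return new

rng = np.random.default_rng(0)
grid = rng.random((6, 8))
loop_result = stencil_loop(grid)
sliced_result = stencil_sliced(grid)
print(np.allclose(loop_result, sliced_result))
```

A loop that updates `grid` in place instead uses some neighbours that have already been updated, which is a different scheme and gives different numbers.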

However, if the grid is very large it will not fit in memory, and if the operation is complicated it will take a long time. Parallel programming is used to overcome these problems, or simply to get the result faster. Tools like Dask let scientists write such simulations themselves, in parallel, almost transparently. Dask currently does not support item assignment, so how can I program a stencil using Dask?

2 answers

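Good question. You are right that dask.array provides parallel computation but does not support item assignment. A stencil can still be computed by writing a function that works on one block of numpy data at a time and mapping it over the array with slightly overlapping boundaries.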

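First, write a function that takes a numpy array and returns a new numpy array with the stencil applied. It should not modify its input.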

import numpy as np

def apply_stencil(x):
    out = np.empty_like(x)  # new array; the input x is left unmodified
    ...  # do arbitrary computations on out
    return out

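Dask works in parallel by splitting an array into separate chunks. A stencil needs values from neighbouring chunks, so the chunks must overlap slightly. This is handled by the dask.array.ghost module, and in particular by the dask.array.map_overlap method.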

In fact, the example in the map_overlap docstring is a 1d derivative stencil:

>>> import numpy as np
>>> import dask.array as da
>>> x = np.array([1, 1, 2, 3, 3, 3, 2, 1, 1])
>>> x = da.from_array(x, chunks=5)
>>> def derivative(x):
...     return x - np.roll(x, 1)
>>> y = x.map_overlap(derivative, depth=1, boundary=0)
>>> y.compute()
array([ 1,  0,  1,  1,  0,  0, -1, -1,  0])
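
The same approach should carry over to the 5-point stencil from the question. The following is a minimal sketch, not a tuned implementation: the function name `five_point` and the random test grid are assumptions made for illustration. It passes `boundary="none"` so that no padding is added at the outer edges, which keeps the outer ring of the grid unchanged just as in the sliced numpy version.

```python
import numpy as np
import dask.array as da

def five_point(block):
    """Average each interior point of a block with its four neighbours.

    The outermost ring of the block is returned unchanged. On internal
    chunk edges that ring is made of ghost cells, which map_overlap trims.
    """
    out = block.copy()
    out[1:-1, 1:-1] = (block[1:-1, 1:-1] + block[:-2, 1:-1] + block[2:, 1:-1]
                       + block[1:-1, :-2] + block[1:-1, 2:]) / 5
    return out

rng = np.random.default_rng(0)
grid_np = rng.random((100, 100))

# Split the grid into four 50 x 50 chunks.
grid = da.from_array(grid_np, chunks=(50, 50))

# depth=1 shares one row/column of neighbouring data between chunks.
result = grid.map_overlap(five_point, depth=1, boundary="none").compute()

# Reference: the same stencil applied to the whole numpy array at once.
expected = five_point(grid_np)
print(np.allclose(result, expected))
```

For several time steps, the `map_overlap` call can be repeated on its own result before calling `compute()`, at the cost of a larger task graph.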

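Dask internally splits arrays into smaller numpy arrays. When you create an array with dask.array, you have to say how it should be divided into chunks, like this: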

grid = dask.array.zeros((100,100), chunks=(50,50))

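This creates a 100 by 100 array divided into 4 chunks. For a stencil, each chunk also needs the values on the edges of its neighbours. Dask provides ghost cells for this purpose.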
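
To make the idea of ghost cells concrete, here is a small numpy-only sketch that does by hand what Dask automates: a 1d array is split into two chunks, each chunk borrows one value from its neighbour, the stencil runs on each chunk, and the borrowed cells are trimmed before the pieces are joined. The function `smooth` and the chunk size of 5 are choices made for this illustration only.

```python
import numpy as np

def smooth(block):
    """3-point average of the interior; the two end points stay unchanged."""
    out = block.copy()
    out[1:-1] = (block[:-2] + block[1:-1] + block[2:]) / 3
    return out

x = np.arange(10, dtype=float) ** 2
expected = smooth(x)          # stencil applied to the whole array

depth = 1                     # number of ghost cells on each internal edge
left = x[:5 + depth]          # chunk 0 plus one ghost cell taken from chunk 1
right = x[5 - depth:]         # chunk 1 plus one ghost cell taken from chunk 0

left_out = smooth(left)[:-depth]    # drop the ghost cell on the right
right_out = smooth(right)[depth:]   # drop the ghost cell on the left

result = np.concatenate([left_out, right_out])
print(np.allclose(result, expected))
```

Without the ghost cells, the points at indices 4 and 5 would be missing one neighbour each and could not be updated correctly.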

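A typical workflow is:

  • Create the array (or load it from a data source)
  • Add ghost cells, map a function over the blocks, then trim the ghost cells

For example: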

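import dask.array as da
# some_function takes one numpy block (with ghost cells) and returns a numpy array of the same shape
grid = da.zeros((100,100), chunks=(50,50))
g = da.ghost.ghost(grid, depth={0:1,1:1}, boundary={0:0,1:1})  # add ghost cells
g2 = g.map_blocks( some_function )
s = da.ghost.trim_internal(g2, {0:1,1:1})  # remove the ghost cells again
s.compute()
# Note: newer Dask releases name this module dask.array.overlap
# (da.overlap.overlap, da.overlap.trim_internal).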

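Because Dask is lazy, none of these calls does any work until you call s.compute(). As MRocklin noted, the mapped function must return a numpy array.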

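By default, dask.array uses the dask.threaded scheduler. Threads work well when the function spends its time in numpy code that releases the GIL, but pure Python code that holds the GIL will not run in parallel. In that case you can switch to another dask scheduler, such as dask.multiprocessing: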

import dask.multiprocessing
import dask

dask.set_options(get=dask.multiprocessing.get)
# Newer Dask releases replace set_options with dask.config.set(scheduler='processes')
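
In recent Dask releases the scheduler can also be chosen for a limited scope with `dask.config.set`. The sketch below assumes such a release; the helper `total` is made up for this example. It uses the "threads" and "synchronous" schedulers so that it runs anywhere. Passing "processes" selects the multiprocessing scheduler, and on platforms that start workers by spawning a fresh interpreter the calling code then probably needs to sit under an `if __name__ == "__main__":` guard.

```python
import dask
import dask.array as da

def total(scheduler):
    """Sum a chunked array of ones using the named scheduler."""
    x = da.ones((1000, 1000), chunks=(250, 250))
    # The setting only applies inside the with block.
    with dask.config.set(scheduler=scheduler):
        return float(x.sum().compute())

threaded = total("threads")          # thread pool, the default for dask.array
single = total("synchronous")        # single-threaded, handy for debugging
print(threaded == single)
```

The single-threaded scheduler is useful when a stencil function raises an error, because the traceback is then not interleaved with thread pool frames.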

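When you then call compute(), Dask starts several Python processes. If the computation is large enough to pay for the overhead of starting those processes and moving data between them, dask.multiprocessing may give better performance. See the Dask documentation on schedulers for more information.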


Source: https://habr.com/ru/post/1658131/

