Parallelize a python numpy.searchsorted loop using cython

I wrote a function in Cython that contains the following loop. Each row of array A1 is binary-searched for all values in array A2, so each loop iteration returns a 2D array of index values. Arrays A1 and A2 are passed in as function arguments and are properly typed.

Array C is pre-allocated at the top indentation level, as Cython requires.

I simplified this question a bit.

...
cdef np.ndarray[DTYPEint_t, ndim=3] C = np.zeros([N,M,M], dtype=DTYPEint)

for j in range(0,N):
    C[j,:,:]  = np.searchsorted(A1[j,:], A2, side='left' )
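
To make the shapes concrete, here is a plain NumPy sketch of the same loop without Cython. The arrays are small made-up values, not the real data; the point is only that each call returns an index array shaped like A2.

```python
import numpy as np

# Made-up example data: N=2 sorted rows in A1, and a 2x2 array of query values.
A1 = np.array([[1, 3, 5],
               [2, 4, 6]])
A2 = np.array([[0, 3],
               [4, 7]])

N = A1.shape[0]
C = np.zeros((N,) + A2.shape, dtype=np.intp)

for j in range(N):
    # The result has the same shape as A2: one insertion index per query value.
    C[j, :, :] = np.searchsorted(A1[j, :], A2, side='left')

print(C[0].tolist())  # [[0, 1], [2, 3]]
print(C[1].tolist())  # [[0, 1], [1, 3]]
```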

So far so good: everything compiles and runs as expected. However, to gain even more speed, I want to parallelize the j loop. My first attempt was simply to write

for j in prange(0,N, nogil=True):
    C[j,:,:]  = np.searchsorted(A1[j,:], A2, side='left' )

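I tried several variations, such as moving the work into a separate nogil_function and changing how the result is assigned to C.

The errors are typically of the form "... Python ... not allowed without gil".

I cannot get it to work. Any suggestions on how to do this?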

EDIT:

setup.py

try:
    from setuptools import setup
    from setuptools import Extension
except ImportError:
    from distutils.core import setup
    from distutils.extension import Extension


from Cython.Build import cythonize

import numpy

extensions = [Extension("matchOnDistanceVectors",
                    sources=["matchOnDistanceVectors.pyx"],
                    extra_compile_args=["/openmp", "/O2"],
                    extra_link_args=[]
                   )]


setup(
    ext_modules=cythonize(extensions),
    include_dirs=[numpy.get_include()]
)

I am on Windows 7, compiling with msvc. I added the /openmp flag, and the arrays are about 200 * 200, ...

+4

I believe searchsorted releases the GIL itself (see https://github.com/numpy/numpy/blob/e2805398f9a63b825f4a2aab22e9f169ff65aae9/numpy/core/src/multiarray/item_selection.c, line 1664, "NPY_BEGIN_THREADS_DEF"), so you can do:

for j in prange(0,N, nogil=True):
    with gil:
        C[j,:,:]  = np.searchsorted(A1[j,:], A2, side='left' )

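This briefly acquires the GIL to do the necessary work on Python objects (which, hopefully, is quick). The GIL should then be released again inside searchsorted, so the searches themselves run largely in parallel.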
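
The same effect can be checked without Cython: if searchsorted does release the GIL internally, as assumed above, ordinary Python threads should overlap in the same way. The following is a minimal sketch with made-up array sizes. `search_row` is a hypothetical helper, and whether the threaded version is actually faster depends on the array sizes and the NumPy build.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

rng = np.random.default_rng(0)
A1 = np.sort(rng.random((8, 100)), axis=1)  # each row sorted, as searchsorted requires
A2 = rng.random((50, 50))

def search_row(j):
    # Runs in a worker thread; the GIL is needed only for the Python-level call.
    return np.searchsorted(A1[j, :], A2, side='left')

# pool.map preserves input order, so row j of the result matches row j of A1.
with ThreadPoolExecutor(max_workers=4) as pool:
    C_parallel = np.stack(list(pool.map(search_row, range(A1.shape[0]))))

# Serial reference, used only to confirm both versions agree.
C_serial = np.stack([np.searchsorted(A1[j, :], A2, side='left')
                     for j in range(A1.shape[0])])
```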


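As an update, I ran a quick test of this (A1.shape==(105,100), A2.shape==(302,302), numbers picked fairly arbitrarily). For 10 repeats the serial version took 4.5 seconds and the parallel version took 1.4 seconds (on a 4-core machine). That is not the full 4x speedup, but it is close.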

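If you are not seeing a speedup, I suspect it could be any of the following: 1) your arrays are small enough that the function-call/numpy overhead dominates; 2) you are not compiling with OpenMP enabled; 3) your compiler does not support OpenMP.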
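
Regarding cause 2, the flag spelling depends on the compiler. The setup.py in the question passes /openmp, which is the MSVC form; GCC expects -fopenmp both when compiling and when linking. A small sketch of choosing the flags follows. `openmp_args` is a hypothetical helper, not part of setuptools or Cython.

```python
def openmp_args(compiler):
    """Return (extra_compile_args, extra_link_args) for a compiler family."""
    if compiler == "msvc":
        # MSVC takes the OpenMP switch at compile time only.
        return ["/openmp", "/O2"], []
    # GCC needs the flag both when compiling and when linking.
    return ["-fopenmp", "-O2"], ["-fopenmp"]

compile_args, link_args = openmp_args("msvc")
print(compile_args, link_args)  # ['/openmp', '/O2'] []
```

The two returned lists would be passed as extra_compile_args and extra_link_args to Extension.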

+1

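You have a bit of a catch-22. You need the GIL to call numpy.searchsorted, but the GIL prevents any kind of parallel processing. Your best bet is to write your own nogil version of searchsorted: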

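# An explicit C return type is needed: a cdef function that returns a
# Python object cannot be declared nogil.
cdef Py_ssize_t mySearchSorted(double[:] array, double target) nogil:
    # binary search implementation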

for j in prange(0,N, nogil=True):
    for k in range(A2.shape[0]):
        for L in range(A2.shape[1]):
            C[j, k, L]  = mySearchSorted(A1[j, :], A2[k, L])

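numpy.searchsorted also has some call overhead, so if N is large it makes sense to use your own searchsorted anyway, just to reduce that overhead.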
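
The body elided in mySearchSorted above could be a standard lower-bound binary search, which matches side='left'. Below is a plain-Python sketch of that logic, checked against the standard library's bisect_left. The name `my_search_sorted` is illustrative, and a Cython version would additionally need cdef-typed local variables to run under nogil.

```python
from bisect import bisect_left

def my_search_sorted(array, target):
    """Leftmost index at which target can be inserted to keep array sorted."""
    lo, hi = 0, len(array)
    while lo < hi:
        mid = (lo + hi) // 2
        if array[mid] < target:
            lo = mid + 1  # target belongs strictly to the right of mid
        else:
            hi = mid      # mid is still a candidate position
    return lo

row = [1.0, 3.0, 3.0, 5.0]
print([my_search_sorted(row, t) for t in (0.0, 3.0, 4.0, 9.0)])  # [0, 1, 3, 4]
```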

0

Source: https://habr.com/ru/post/1626404/

