I started experimenting with parallel programming using Cython and OpenMP, and I have a simple program that sums over an array using prange:
import numpy as np
from cython.parallel import prange
from cython import boundscheck, wraparound

@boundscheck(False)
@wraparound(False)
def parallel_summation(double[:] vec):
    cdef int n = vec.shape[0]
    cdef double total = 0  # must be initialized before accumulating
    cdef int i
    # The in-place += inside prange makes `total` a reduction variable.
    for i in prange(n, nogil=True):
        total += vec[i]
    return total
It seems to work fine when built with my setup.py file. However, I was wondering whether it is possible to configure prange and get a little more control over what each processor does.
Let's say I have 4 processors: I want to split the vector to be summed into 4 parts and have each processor add up the elements of its own part locally. Then, at the end, I can combine the results from each processor to get the total. From the Cython documentation, I was not able to figure out whether something like this is possible (the documentation is a bit sparse).
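To make the decomposition I have in mind concrete, here is a rough plain-Python sketch. It is not Cython or OpenMP code: the helper names `chunk_bounds` and `chunked_sum` are made up for illustration, and the thread pool is only an assumed stand-in for the 4 processors.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, num_chunks):
    """Split range(n) into num_chunks contiguous (start, stop) pairs of nearly equal size."""
    base, extra = divmod(n, num_chunks)
    bounds = []
    start = 0
    for k in range(num_chunks):
        # The first `extra` chunks take one additional element.
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def chunked_sum(vec, num_chunks=4):
    """Sum vec by computing one local sum per chunk, then combining the partial sums."""
    bounds = chunk_bounds(len(vec), num_chunks)

    def local_sum(bound):
        start, stop = bound
        total = 0.0
        for i in range(start, stop):
            total += vec[i]
        return total

    # One worker per chunk (hypothetical stand-in for one processor each).
    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        partial_sums = list(pool.map(local_sum, bounds))

    # Final combine step over the per-chunk results.
    return sum(partial_sums)
```

Calling `chunked_sum(values)` on a list of floats gives the same total as a plain loop. Because of Python's GIL, this version only illustrates the split / local sum / combine structure and should not be expected to run faster; what I am after is the same structure expressed through prange.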