I use Numba's jit decorator a lot, and I recently noticed the new features added to Numba, in particular parallel and the stencil decorator.
The stencil decorator is very nice for writing cleaner code, but after several tests it seems to be merely elegant, not efficient. Here is some sample code:
import numba

# Plain jitted loop, types inferred at first call
@numba.njit
def nb_jit(A, out):
    for i in range(1, A.shape[0]-1):
        out[i] = 0.5*(A[i+1] - A[i-1])
    return out

# Same loop with an explicit signature
@numba.njit(numba.float64[:](numba.float64[:], numba.float64[:]))
def nb_jit_typed(A, out):
    for i in range(1, A.shape[0]-1):
        out[i] = 0.5*(A[i+1] - A[i-1])
    return out

# Same loop parallelized with prange
@numba.njit(parallel=True)
def nb_jit_paral(A, out):
    for i in numba.prange(1, A.shape[0]-1):
        out[i] = 0.5*(A[i+1] - A[i-1])
    return out

# Stencil kernel: indices are relative to the current element
@numba.stencil
def s2(A):
    return 0.5*(A[1] - A[-1])

@numba.njit
def nb_stencil(A):
    return s2(A)

@numba.njit(parallel=True)
def nb_stencil_paral(A):
    return s2(A)
I tested these functions with the following arrays:
import numpy as np
arr = np.random.rand(100000)
res = arr.copy()
This gives me the following runtimes (of course, I ran each function once before timing it):
_____________________________________________
| %timeit nb_jit(arr, res)        |   36 µs |
| %timeit nb_jit_typed(arr, res)  |   68 µs |
| %timeit nb_jit_paral(arr, res)  |  151 µs |
| %timeit nb_stencil(arr)         |   59 µs |
| %timeit nb_stencil_paral(arr)   |  241 µs |
_____________________________________________
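Before reading too much into these numbers, it may be worth confirming that all variants compute the same interior values. Below is a minimal sketch using a plain NumPy reference. The helper name `central_diff_reference` is hypothetical (it is not part of the code above), and the Numba functions appear only in a comment so that the block runs without Numba installed.

```python
import numpy as np

def central_diff_reference(A):
    """Pure NumPy central difference; boundary entries are left at 0."""
    out = np.zeros_like(A)
    out[1:-1] = 0.5 * (A[2:] - A[:-2])
    return out

arr = np.random.rand(100000)
ref = central_diff_reference(arr)

# Only the interior is compared: the loop versions above never write
# out[0] and out[-1], so those entries keep whatever `out` held before.
# Example check against one of the jitted functions (assumes it is defined):
#     np.testing.assert_allclose(nb_jit(arr, arr.copy())[1:-1], ref[1:-1])
```

Restricting the comparison to `[1:-1]` sidesteps the boundary handling, which, as far as I know, differs between the variants: the loop versions leave the two end values untouched, while the stencil versions fill the border with a constant (0 by default).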
Therefore, I was wondering:
- Why is nb_jit_typed slower than nb_jit? As far as I remember, that was not the case the last time I tested this.
- Why is nb_jit_paral so slow?
My setup:
import numba
numba.__version__
'0.37.0'
import multiprocessing
multiprocessing.cpu_count()
4
Edit:
Total time for 10000 calls on an array of shape (1000000,), measured with time.time():
jit | 16.37 s
jit typed | 17.22 s
jit parallel | 18.45 s
stencil | 21.95 s
stencil paral | 24.48 s
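A manual measurement of this kind can be sketched as follows. This is only an illustration under stated assumptions: `time_calls` and `np_central_diff` are hypothetical names, `time.perf_counter()` is swapped in for `time.time()` because it is better suited to measuring intervals, and a NumPy function stands in for the jitted ones so that the block runs without Numba.

```python
import time
import numpy as np

def time_calls(func, args, n_calls=10000):
    """Return total wall-clock seconds for n_calls calls of func(*args)."""
    func(*args)  # warm-up call; for a jitted function this triggers compilation
    start = time.perf_counter()
    for _ in range(n_calls):
        func(*args)
    return time.perf_counter() - start

def np_central_diff(A, out):
    """NumPy stand-in with the same (A, out) interface as nb_jit."""
    out[1:-1] = 0.5 * (A[2:] - A[:-2])
    return out

arr = np.random.rand(1000000)
res = arr.copy()

# Fewer calls than in the measurement above, to keep the example quick
elapsed = time_calls(np_central_diff, (arr, res), n_calls=100)
print(f"numpy stand-in: {elapsed:.3f} s for 100 calls")
```

Any of the jitted functions could be passed in the same way, for example `time_calls(nb_jit, (arr, res))` or `time_calls(nb_stencil, (arr,))`; the warm-up call keeps compilation time out of the total.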