Good, I found the problem. As Seberg pointed out, the memoryview version looked slower because the measurement included the automatic conversion from a numpy array to a memoryview.
I used the following function, defined inside the Cython module, to measure the time:
    def test(params):
        import timeit

        im = params[0]
        pd = params[1]
        box_half_size = params[2]

        # Version 1: arguments typed as numpy arrays.
        t1 = timeit.Timer(lambda: image_box1(im, pd, box_half_size))
        print('image_box1: typed numpy:')
        print(min(t1.repeat(3, 10)))

        # Convert to memoryviews once, outside the timed call.
        cdef np.uint8_t[:, ::1] im2 = im
        cdef np.float64_t[:] pd2 = pd

        # Version 2: arguments typed as memoryviews.
        t2 = timeit.Timer(lambda: image_box2(im2, pd2, box_half_size))
        print('image_box2: memoryview:')
        print(min(t2.repeat(3, 10)))
Result:
image_box1: typed numpy: 9.07607864065e-05
image_box2: memoryview: 5.81799904467e-05
So the memoryview version really is faster!
Note that I converted im and pd to memoryviews before calling image_box2. If I skip this step and pass im and pd directly, the conversion happens on every call and image_box2 becomes slower:
image_box1: typed numpy: 9.12262257771e-05
image_box2: memoryview: 0.000185245087778
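The same measurement pattern can be sketched in plain Python, which runs without compiling anything. This is only an illustration under stated assumptions: `kernel`, `best_time` and `data` are hypothetical names, and the built-in `memoryview` over a `bytearray` stands in for the Cython typed memoryviews, so the absolute numbers will not match the ones above. The point is just where the conversion sits relative to the timed call.

```python
import timeit


def kernel(view):
    # Hypothetical stand-in for image_box2: works on an already prepared view.
    return view[0] + view[-1]


def best_time(func, repeat=3, number=1000):
    # Best of several runs, the same min(timer.repeat(...)) pattern used above.
    return min(timeit.Timer(func).repeat(repeat, number))


data = bytearray(range(256)) * 64   # hypothetical input buffer
view = memoryview(data)             # converted once, outside the timed call

# Conversion excluded from the measurement.
t_preconverted = best_time(lambda: kernel(view))
# Conversion repeated inside every timed call.
t_with_conversion = best_time(lambda: kernel(memoryview(data)))

print('pre-converted view:     ', t_preconverted)
print('conversion in each call:', t_with_conversion)
```

Comparing the two timings gives a rough idea of how much of the measurement is conversion overhead rather than the work done inside the function.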