I am trying to write Python code that calls the following Cython function, test1:
    cimport numpy as np

    def test1(np.ndarray[np.int32_t, ndim=2] ndk,
              np.ndarray[np.int32_t, ndim=2] nkw,
              np.ndarray[np.float64_t, ndim=2] phi):
        for _ in xrange(int(1e5)):
            test2(ndk, nkw, phi)

    cdef int test2(np.ndarray[np.int32_t, ndim=2] ndk,
                   np.ndarray[np.int32_t, ndim=2] nkw,
                   np.ndarray[np.float64_t, ndim=2] phi):
        return 1
My plain Python code calls test1 and passes it three NumPy arrays, which are very large (roughly 10^4 × 10^3 elements each). test1, in turn, calls test2, which is defined with the cdef keyword, and passes these arrays on to it. Since test1 needs to call test2 many times (about 10^5) before it returns, and test2 never needs to be called from outside the Cython code, I used cdef instead of def.
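For scale (taking the shapes above literally as 10^4 × 10^3, which is my assumption here), a single int32 array of that shape is already about 40 MB, so any per-call copy or leaked reference adds up very quickly:

```python
import numpy as np

# Hypothetical array matching the shapes described above
ndk = np.zeros((10**4, 10**3), dtype=np.int32)
print(ndk.nbytes)  # 10**7 elements * 4 bytes = 40000000 bytes (~40 MB)
```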
But the problem is that every time test1 calls test2, memory usage grows steadily. I tried calling gc.collect() from outside the Cython code, but it does not help, and eventually the program is killed by the system because it has consumed all available memory. I noticed that this problem only occurs with cdef and cpdef functions; if I change test2 to def, it works fine.
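One way to confirm the steady growth from the Python side is to sample the process's peak resident set size around the call. A minimal sketch using the stdlib resource module (Unix-only; note ru_maxrss is reported in kilobytes on Linux but bytes on macOS), where the call to test1 is a placeholder for the compiled function above:

```python
import gc
import resource

def peak_rss():
    # Peak resident set size of this process so far
    # (kB on Linux, bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
# ... call test1(ndk, nkw, phi) here ...
gc.collect()  # in my case this had no effect on the growth
after = peak_rss()
print(before, after)
```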
I expected test1 to pass references to these arrays to test2, not new objects. But it looks as though new objects are created for these arrays on every call and passed to test2, and those objects are never reclaimed by the Python GC afterwards.
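For comparison, a plain Python def function receives a reference to the very same array object, with no copy involved; a quick sanity check (takes_array is a made-up helper for illustration):

```python
import numpy as np

def takes_array(a):
    # Return the identity of the object the function actually received
    return id(a)

x = np.zeros((100, 100), dtype=np.int32)
# Same object on both sides of the call: passed by reference, not copied
print(takes_array(x) == id(x))
```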
Did I miss something?