Accelerate numro kronecker products

I am working on my first major python project. I have one function that has the following code in it:

# EXPAND THE EXPECTED VALUE TO APPLY TO ALL STATES, # THEN UPDATE fullFnMat EV_subset_expand = np.kron(EV_subset, np.ones((nrows, 1))) fullFnMat[key] = staticMat[key] + EV_subset_expand 

In my code profiler, it seems like this kronecker product actually takes a huge amount of time.

 Function was called by... ncalls tottime cumtime /home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/BellmanEquation.py:17(bellmanFn) <- 19 37.681 38.768 /home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/dpclient.py:467(solveTheModel) {numpy.core.multiarray.concatenate} <- 342 27.319 27.319 /usr/lib/pymodules/python2.7/numpy/lib/shape_base.py:665(kron) /home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/dpclient.py:467(solveTheModel) <- 1 11.041 91.781 <string>:1(<module>) {method 'argsort' of 'numpy.ndarray' objects} <- 19 7.692 7.692 /usr/lib/pymodules/python2.7/numpy/core/fromnumeric.py:597(argsort) /usr/lib/pymodules/python2.7/numpy/core/numeric.py:789(outer) <- 171 2.526 2.527 /usr/lib/pymodules/python2.7/numpy/lib/shape_base.py:665(kron) {method 'max' of 'numpy.ndarray' objects} <- 209 2.034 2.034 /home/stevejb/myhg/dpsolve/ootest/tests/ddw2011/profile_dir/dpclient.py:391(getValPolMatrices) 

Is there a way to get faster kronecker products in Numpy? It does not seem to take as much time as it is.

+6
source share
3 answers

You can, of course, take a look at the source for np.kron . It can be found in numpy/lib/shape_base.py and you can see if there are any improvements that can be made or simplifications that can make it more efficient. Alternatively, you can write your own using Cython or some other low-level language bindings to try to achieve better performance.

Or as @matt suggested something like the following, perhaps initially faster:

 import numpy as np nrows = 10 a = np.arange(100).reshape(10,10) b = np.tile(a,nrows).reshape(nrows*a.shape[0],-1) # equiv to np.kron(a,np.ones((nrows,1))) 

or

 b = np.repeat(a,nrows*np.ones(a.shape[0],np.int),axis=0) 

Timings:

 In [80]: %timeit np.tile(a,nrows).reshape(nrows*a.shape[0],-1) 10000 loops, best of 3: 25.5 us per loop In [81]: %timeit np.kron(a,np.ones((nrows,1))) 10000 loops, best of 3: 117 us per loop In [91]: %timeit np.repeat(a,nrows*np.ones(a.shape[0],np.int),0) 100000 loops, best of 3: 12.8 us per loop 

Using np.repeat for size arrays in the example above gives a pretty nice np.repeat speedup that is not too shabby.

+7
source

Perhaps np.kron() allocates memory and then you throw it away. Use np.tile() . I don’t know if it allocates more memory or plays indexing tricks under covers. If you only multiply EV_subset by units, you really don't need to call np.kron() .

+2
source

The following may help (in the general case, when one of the arrays is not "one"). For example, for two arrays A, B of the form (a, b, c) and (d, e, f); Summarize as needed.

Gets this in a single "multiply" op and quick swap.

 kprod = A[:,newaxis,:,newaxis,:,newaxis] * B[newaxis,:, newaxis,:, newaxis,:] # # kprod.shape = (a,d,b,e,c,f) now; is full outer product with desired arrangement # in memory. kprod.shape = (a*d,b*e,c*f) # reshape 'in place' 

(maybe it's kron (B, A) instead of kron (A, B), the opposite of A and B, if necessary)

+1
source

Source: https://habr.com/ru/post/895896/


All Articles