Python: optimizing loops

I want to optimize Python code consisting of two nested loops. I am not very familiar with numpy, but I understand that it should let me make this kind of task more efficient. Below is test code I wrote that reflects what happens in the real code. Currently, using numpy.arange and numpy.nditer is slower than plain Python. What am I doing wrong? What is the best solution to this problem?

Thank you for your help!

    import numpy
    import time

    # set up a problem analogous to that in the real code
    npoints_per_plane = 1000
    nplanes = 64
    naxis = 1000
    npoints3d = naxis + npoints_per_plane * nplanes
    npoints = naxis + npoints_per_plane
    specres = 1000

    # this is where the data is being mapped to
    sol = dict()
    sol["ems"] = numpy.zeros(npoints3d)
    sol["abs"] = numpy.zeros(npoints3d)

    # this would normally be non-random input data
    data = dict()
    data["ems"] = numpy.zeros((npoints,specres))
    data["abs"] = numpy.zeros((npoints,specres))
    for ip in range(npoints):
        data["ems"][ip,:] = numpy.random.random(specres)[:]
        data["abs"][ip,:] = numpy.random.random(specres)[:]
    ems_mod = numpy.random.random(1)[0]
    abs_mod = numpy.random.random(1)[0]
    ispec = numpy.random.randint(specres)

    # this is the code I want to optimize

    t0 = time.time()

    # usual python range and iterator
    for ip in range(npoints_per_plane):
        jp = naxis + ip
        for ipl in range(nplanes):
            ip3d = jp + npoints_per_plane * ipl
            sol["ems"][ip3d] = data["ems"][jp,ispec] * ems_mod
            sol["abs"][ip3d] = data["abs"][jp,ispec] * abs_mod

    t1 = time.time()

    # numpy ranges and iterator
    ip_vals = numpy.arange(npoints_per_plane)
    ipl_vals = numpy.arange(nplanes)
    for ip in numpy.nditer(ip_vals):
        jp = naxis + ip
        for ipl in numpy.nditer(ipl_vals):
            ip3d = jp + npoints_per_plane * ipl
            sol["ems"][ip3d] = data["ems"][jp,ispec] * ems_mod
            sol["abs"][ip3d] = data["abs"][jp,ispec] * abs_mod

    t2 = time.time()

    print("plain python: %0.3f seconds" % ( t1 - t0 ))
    print("numpy: %0.3f seconds" % ( t2 - t1 ))

Edit: "jp = naxis + ip" is now computed only once, in the outer for loop.

Additional note:

I worked out how to get numpy to execute the inner loop quickly, but not the outer loop:

    # numpy vectorization
    for ip in range(npoints_per_plane):
        jp = naxis + ip
        sol["ems"][jp:jp+npoints_per_plane*nplanes:npoints_per_plane] = data["ems"][jp,ispec] * ems_mod
        sol["abs"][jp:jp+npoints_per_plane*nplanes:npoints_per_plane] = data["abs"][jp,ispec] * abs_mod

Joe's solution below shows how to do both together, thanks!

1 answer

The best way to write loops in numpy is to not write loops and use vectorized operations instead. For instance:

    c = 0
    for i in range(len(a)):
        c += a[i] + b[i]

becomes

    c = np.sum(a + b, axis=0)

For a and b of shape (100000, 100), the first version takes 0.344 seconds and the second takes 0.062 seconds.
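A rough way to reproduce this kind of comparison is sketched below. The array shape is deliberately smaller than the one quoted above so that it runs quickly, the helper names loop_sum and vectorized_sum are made up for illustration, and the timings will differ from machine to machine.

```python
import time
import numpy as np


def loop_sum(a, b):
    """Accumulate a[i] + b[i] row by row, as in the explicit loop above."""
    c = 0
    for i in range(len(a)):
        c += a[i] + b[i]
    return c


def vectorized_sum(a, b):
    """Compute the same column sums with a single vectorized call."""
    return np.sum(a + b, axis=0)


# Illustrative inputs (assumed sizes, not the ones timed in the answer).
rng = np.random.default_rng(0)
a = rng.random((10000, 100))
b = rng.random((10000, 100))

t0 = time.perf_counter()
c_loop = loop_sum(a, b)
t1 = time.perf_counter()
c_vec = vectorized_sum(a, b)
t2 = time.perf_counter()

print("loop:       %0.4f seconds" % (t1 - t0))
print("vectorized: %0.4f seconds" % (t2 - t1))
print("results agree:", np.allclose(c_loop, c_vec))
```

Both versions return an array with one entry per column, so the comparison at the end is a fair check that only the speed changes.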

In the case presented in your question, the following does what you want (shown for 'ems'; the 'abs' case is the same with 'abs' and abs_mod substituted):

    sol['ems'][naxis:] = numpy.ravel(
        numpy.repeat(
            data['ems'][naxis:,ispec,numpy.newaxis] * ems_mod,
            nplanes,
            axis=1
        ),
        order='F'
    )

This could be optimized further with some tricks, but that would reduce clarity and is probably premature optimization, because:

plain python: 0.064 seconds

numpy: 0.002 seconds
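To check that the vectorized assignment really matches the nested loops, a small self-contained comparison can help. This is a sketch under assumptions: the sizes are shrunk from the question so that the reference loop runs instantly, and the mods dictionary is just a convenience wrapper around ems_mod and abs_mod that does not appear in the original code.

```python
import numpy

# Reduced sizes (assumed for illustration; the question uses 1000/64/1000/1000).
npoints_per_plane = 50
nplanes = 4
naxis = 20
specres = 10
npoints3d = naxis + npoints_per_plane * nplanes
npoints = naxis + npoints_per_plane

rng = numpy.random.default_rng(0)
data = {key: rng.random((npoints, specres)) for key in ("ems", "abs")}
ems_mod = rng.random()
abs_mod = rng.random()
mods = {"ems": ems_mod, "abs": abs_mod}  # hypothetical helper, not in the question
ispec = int(rng.integers(specres))

# Reference result: the nested loops from the question.
sol_loop = {key: numpy.zeros(npoints3d) for key in ("ems", "abs")}
for ip in range(npoints_per_plane):
    jp = naxis + ip
    for ipl in range(nplanes):
        ip3d = jp + npoints_per_plane * ipl
        for key in ("ems", "abs"):
            sol_loop[key][ip3d] = data[key][jp, ispec] * mods[key]

# Vectorized result: newaxis + repeat + Fortran-order ravel, applied per key.
sol_vec = {key: numpy.zeros(npoints3d) for key in ("ems", "abs")}
for key in ("ems", "abs"):
    column = data[key][naxis:, ispec, numpy.newaxis] * mods[key]  # (npoints_per_plane, 1)
    tiled = numpy.repeat(column, nplanes, axis=1)                 # (npoints_per_plane, nplanes)
    sol_vec[key][naxis:] = numpy.ravel(tiled, order="F")

for key in ("ems", "abs"):
    print(key, "matches loop:", numpy.allclose(sol_loop[key], sol_vec[key]))
```

The only remaining Python-level loop runs over the two dictionary keys, so its cost does not grow with the number of points or planes.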

The solution works as follows:

The original version contains jp = naxis + ip, which simply skips the first naxis elements; the slice [naxis:] likewise selects everything except the first naxis elements. Your inner loop repeats the value data[jp,ispec] nplanes times and writes it to several locations ip3d = jp + npoints_per_plane * ipl, which is equivalent to indexing a flattened 2D array offset by naxis. Therefore, a second dimension is added via numpy.newaxis to the (previously 1D) data['ems'][naxis:, ispec], and the values are repeated nplanes times along this new dimension via numpy.repeat. The resulting 2D array is then flattened again via numpy.ravel (in Fortran order, that is, with the first axis varying fastest) and written to the corresponding subarray of sol['ems']. If the target array were actually 2D, the repeat could be skipped and numpy's automatic broadcasting used instead.
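The individual steps are easier to follow on a tiny example. In the sketch below, the three-element values array is an assumed stand-in for data['ems'][naxis:, ispec] * ems_mod, and the sizes are chosen only so the result can be read at a glance. The last part illustrates the broadcasting remark by viewing the 1D target as a 2D array, which is one possible reading of that remark rather than code from the answer.

```python
import numpy

npoints_per_plane, nplanes, naxis = 3, 2, 2

# Stand-in for data['ems'][naxis:, ispec] * ems_mod (assumed values).
values = numpy.array([10.0, 20.0, 30.0])

column = values[:, numpy.newaxis]                  # shape (3, 1)
repeated = numpy.repeat(column, nplanes, axis=1)   # shape (3, 2), each row holds one value twice
flat = numpy.ravel(repeated, order="F")            # shape (6,), walks down the columns first

sol = numpy.zeros(naxis + npoints_per_plane * nplanes)
sol[naxis:] = flat
print(sol.tolist())
# [0.0, 0.0, 10.0, 20.0, 30.0, 10.0, 20.0, 30.0]

# Broadcasting variant: view the target as (nplanes, npoints_per_plane)
# and let numpy stretch the 1D values across the plane axis.
sol_b = numpy.zeros_like(sol)
sol_b[naxis:].reshape(nplanes, npoints_per_plane)[:] = values
print(numpy.array_equal(sol, sol_b))
# True
```

The reshape here returns a view because the slice sol_b[naxis:] is contiguous, so assigning into it writes straight into sol_b without an intermediate repeated array.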

If you run into a situation where loops cannot be avoided, you can use Cython (which supports efficient buffer access to numpy arrays).


Source: https://habr.com/ru/post/1492112/

