Vectorizing a for loop in numpy / scipy?

Question


I am trying to vectorize a for loop inside a class method. The loop iterates over a set of points and, depending on whether a certain flag (called "self.condition_met") is true, calls a pair of functions on each point and appends the result to a list. Each point is an element of a list of vectors, that is, a data structure that looks like array([[1,2,3], [4,5,6], ...]). Here is the problematic function:

    class myClass:
        def my_inefficient_method(self):
            final_vector = []
            # Assume 'my_vector' and 'my_other_vector' are defined numpy arrays
            for point in all_points:
                if not self.condition_met:
                    a = self.my_func1(point, my_vector)
                    b = self.my_func2(point, my_other_vector)
                else:
                    a = self.my_func3(point, my_vector)
                    b = self.my_func4(point, my_other_vector)
                c = a + b
                final_vector.append(c)
            # Choose random element from resulting vector 'final_vector'

self.condition_met is set before my_inefficient_method is called, so it does not need to be checked on every iteration, but I'm not sure how best to write this. Since there are no destructive operations here, it looks like I could rewrite the whole thing as a vector operation. Is this possible? Any ideas on how to do it?


optimization python vectorization numpy scipy

user248237dfsf Apr 19 '10 at 19:17


3 answers

mtrw · Answer 1 · 2010-04-19T19:35:48+0000

Is it possible to rewrite the my_funcX functions so that they operate on the whole array of points at once? If so, you can do

    class myClass:
        def my_efficient_method(self):
            # Assume 'all_points', 'my_vector' and 'my_other_vector' are defined numpy arrays
            if not self.condition_met:
                a = self.my_func1(all_points, my_vector)
                b = self.my_func2(all_points, my_other_vector)
            else:
                a = self.my_func3(all_points, my_vector)
                b = self.my_func4(all_points, my_other_vector)
            final_vector = a + b
            # Choose random element from resulting vector 'final_vector'
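As a rough sketch of what rewriting the my_funcX functions for vectorization could look like: the function bodies below are hypothetical stand-ins (the question does not show the real ones), assumed here to be a dot product and a squared distance. The only point is that each function accepts the whole (n, d) array of points and returns one value per row, so the result matches the per-point loop.

```python
import numpy as np

# Hypothetical stand-ins for my_func1 / my_func2; the real bodies are not
# shown in the question.

def my_func1_point(point, my_vector):
    # Per-point version: dot product of a single point with my_vector
    return np.dot(point, my_vector)

def my_func1_vec(all_points, my_vector):
    # Array version: (n, d) @ (d,) -> (n,), one dot product per row
    return all_points @ my_vector

def my_func2_point(point, my_other_vector):
    # Per-point version: squared distance from a single point to my_other_vector
    return ((point - my_other_vector) ** 2).sum()

def my_func2_vec(all_points, my_other_vector):
    # Broadcasting subtracts my_other_vector from every row,
    # then the squares are summed along each row
    return ((all_points - my_other_vector) ** 2).sum(axis=1)

all_points = np.array([[1, 2, 3], [4, 5, 6]])
my_vector = np.array([1, 0, 1])
my_other_vector = np.array([1, 1, 1])

# Vectorized: no Python-level loop over the points
final_vector = my_func1_vec(all_points, my_vector) + my_func2_vec(all_points, my_other_vector)

# Loop version, kept only to check that both approaches agree
looped = [my_func1_point(p, my_vector) + my_func2_point(p, my_other_vector)
          for p in all_points]

# Choose a random element from the resulting vector
pick = np.random.default_rng().choice(final_vector)
```

Whether this is possible for the real functions depends on what they do internally; anything built from arithmetic, reductions and broadcasting usually converts directly.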

doug · Answer 2 · 2010-04-19T19:52:09+0000

It takes just a couple of lines of code in NumPy (the rest just creates a data set, a couple of functions and settings).

    import numpy as NP

    # create two functions
    fnx1 = lambda x: x**2
    fnx2 = lambda x: NP.sum(fnx1(x))

    # create some data
    M = NP.random.randint(10, 99, 40).reshape(8, 5)

    # create an index array based on condition satisfaction
    # (is the sum of that row/data point even or odd?)
    ndx = NP.where(NP.sum(M, 1) % 2 == 0)[0]

    # only those data points that satisfy the condition (even row sum)
    # are passed to one function then another, and the result of applying
    # both functions to each data point is stored in an array
    res = NP.apply_along_axis(fnx2, 1, M[ndx])
    print(res)
    # prints one value per selected row; the values vary because the data is random

From your description, I abstracted this flow:

check the condition (boolean) 'if True'
call the paired functions on those data points (rows) that satisfy the condition
append the result of each set of calls to a list ('res' in the code above)
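The same flow can also be sketched with a boolean mask in place of NP.where and NP.apply_along_axis. This is an alternative formulation, not doug's code; the small data set and the even-row-sum condition are illustrative assumptions.

```python
import numpy as np

M = np.array([[1, 2, 3],    # row sum 6 (even)
              [1, 1, 1],    # row sum 3 (odd)
              [2, 2, 2]])   # row sum 6 (even)

# Step 1: check the condition for every row at once
mask = M.sum(axis=1) % 2 == 0

# Steps 2 and 3: square the selected rows, then sum each row;
# the results land directly in an array instead of a list
res = (M[mask] ** 2).sum(axis=1)
```

Here mask is [True, False, True], so only the first and third rows are processed and res holds one value for each of them.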

Casey W. stark · Answer 3 · 2010-04-19T19:45:06+0000

It is probably best to do what mtrw suggests, but if you are not sure how to vectorize the functions yourself, you can try numpy.vectorize on the my_func functions:

http://docs.scipy.org/doc/numpy/reference/generated/numpy.vectorize.html
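A minimal sketch of that approach, assuming a hypothetical my_func that handles one point at a time (the real functions are not shown in the question). The signature argument tells numpy.vectorize that each call consumes a whole 1-D point rather than a single scalar. Note that the NumPy documentation describes vectorize as a convenience that is essentially a for loop, so it tidies the calling code but should not be expected to run faster.

```python
import numpy as np

def my_func(point, my_vector):
    # Hypothetical function written for a single point
    return float(np.dot(point, my_vector))

# "(d),(d)->()": each call takes two length-d vectors and returns a scalar
vfunc = np.vectorize(my_func, signature="(d),(d)->()")

all_points = np.array([[1, 2, 3], [4, 5, 6]])
my_vector = np.array([1, 0, 1])

# my_vector is broadcast against every row of all_points
result = vfunc(all_points, my_vector)
```

The result has one entry per row of all_points, the same values a loop calling my_func on each point would produce.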
