NumPy: an efficient way to generate combinations from given ranges

I have an n-dimensional array as shown below:

np.array([[0,3],[0,3],[0,10]]) 

In this array, each row gives the low and high values of a range, both inclusive. For example, [0,3] stands for [0,1,2,3].

I need to create every combination of the values in the ranges above. For example, I want [0,0,0], [0,0,1] ... [0,1,0] ... [3,3,10]

I tried the following to get what I want:

    from functools import reduce  # needed on Python 3

    import numpy as np

    ds = np.array([[0, 3], [0, 3], [0, 10]])
    nItems = int(reduce(lambda a, b: a * (b[1] - b[0] + 1), ds, 1))
    myCombinations = np.zeros((nItems,))
    nArrays = []
    for x in range(ds.shape[0]):
        low = ds[x][0]
        high = ds[x][1]
        nitm = high - low + 1
        ar = [x + low for x in range(nitm)]
        nArrays.append(ar)
    # cartesian() is the helper from the question linked below
    myCombinations = cartesian(nArrays)

The cartesian function was taken from "Using numpy to build an array of all combinations of two arrays".

I need to do this several million times.

My question is: is there a better, more efficient way to do this?

2 answers

I think you are looking for np.mgrid. Unfortunately, it returns the array in a different layout from the one you need, so a little post-processing is required:

    a = np.mgrid[0:4, 0:4, 0:11]     # All points in a 3D grid within the given ranges
    a = np.rollaxis(a, 0, 4)         # Make the 0th axis into the last axis
    a = a.reshape((4 * 4 * 11, 3))   # Now you can safely reshape while preserving order

Explanation

np.mgrid gives you a set of grid points in N-dimensional space. Let me show this with a smaller example to make everything clearer:

    >>> a = np.mgrid[0:2, 0:2]
    >>> a
    array([[[0, 0],
            [1, 1]],

           [[0, 1],
            [0, 1]]])

Since I gave two sets of ranges, 0:2, 0:2, I get a 2D mesh. What mgrid returns is the x and y values corresponding to the grid points (0, 0), (0, 1), (1, 0) and (1, 1) in the 2D space. a[0] tells you what the x values are for the four points, and a[1] tells you what the y values are.

But what you really need is a list of the actual grid points that I wrote out, not the x and y values of these points separately. The first instinct is simply to reshape the array as desired:

    >>> a.reshape((4, 2))
    array([[0, 0],
           [1, 1],
           [0, 1],
           [0, 1]])

But this clearly does not work, because reshape effectively works on the flattened array (the array obtained by simply reading all the elements in order), and that is not what you want.
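To see why the naive reshape fails, it can help to look at the flattened order directly. A minimal sketch, assuming NumPy's default C (row-major) order; the names `flat` and `naive` are only for illustration:

```python
import numpy as np

a = np.mgrid[0:2, 0:2]

# Reading the elements in order gives all x-values first, then all y-values.
flat = a.ravel()

# reshape((4, 2)) just cuts that flat sequence into consecutive pairs,
# so each row mixes two x-values or two y-values instead of one (x, y) point.
naive = a.reshape((4, 2))

print(flat)
print(naive)
```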

What you want to do is look down the third dimension of a and create an array:

    [ [a[0][0, 0], a[1][0, 0]],
      [a[0][0, 1], a[1][0, 1]],
      [a[0][1, 0], a[1][1, 0]],
      [a[0][1, 1], a[1][1, 1]] ]

which reads: “First tell me the first point (x1, y1), then the second point (x2, y2), ...” and so on. Perhaps this is better explained by a figure. Here's what a looks like:

                    you want to read
                    in this direction
                     (0, 0)   (0, 1)
                       |        |
                       |        |
                       v        v

              /        0--------0            +----> axis0
     x-values |       /|       /|           /|
              |      / |      / |    axis1 / |
              \     1--------1  |         L  |
                    |  |     |  |            v
              /     |  0-----|--1           axis2
     y-values |     | /      | /
              |     |/       |/
              \     0--------1

                       |        |
                       |        |
                       v        v
                     (1, 0)   (1, 1)

np.rollaxis gives you a way to do this. np.rollaxis(a, 0, 3) in the above example says: “Take the 0th (outermost) axis and make it the last (innermost) axis.” (Note: only axes 0, 1 and 2 exist here, so “send the 0th axis to position 3” is a way of telling NumPy to put the 0th axis after the last axis.)

    >>> a = np.rollaxis(a, 0, 3)
    >>> a
    array([[[0, 0],
            [0, 1]],

           [[1, 0],
            [1, 1]]])

This starts to look the way you want, except for the extra dimension of the array. We want to merge dimensions 0 and 1 to get a single array of grid points. Now that the flattened array reads in the order you expect, you can safely reshape it to get the desired result.

    >>> a = a.reshape((4, 2))
    >>> a
    array([[0, 0],
           [0, 1],
           [1, 0],
           [1, 1]])
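The NumPy documentation now recommends np.moveaxis over np.rollaxis for this kind of axis move, and a plain transpose does the same job. A small sketch checking that all three agree on the 2D example (the variable names are illustrative and not part of the answer above):

```python
import numpy as np

a = np.mgrid[0:2, 0:2]            # shape (2, 2, 2): axis 0 selects x or y

rolled = np.rollaxis(a, 0, 3)     # the call used above
moved = np.moveaxis(a, 0, -1)     # same move: axis 0 becomes the last axis
transposed = a.transpose(1, 2, 0) # same move spelled as an explicit permutation

# Every variant puts (x, y) together along the last axis,
# so reshaping to (-1, 2) lists the grid points in order.
points = moved.reshape(-1, 2)
print(points)
```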

The 3D version does the same, except that I could not make a figure for this, since it would be in 4D.
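The same steps can be wrapped in a helper that takes the array of inclusive [low, high] pairs from the question. This is only a sketch under assumptions: `grid_points` is a hypothetical name, np.moveaxis is swapped in for np.rollaxis so the target position does not depend on the number of ranges, and the ranges are assumed to hold integers.

```python
import numpy as np


def grid_points(ranges):
    """Return all combinations for inclusive integer [low, high] ranges.

    The result has one row per combination and one column per range,
    ordered like nested loops with the last range varying fastest.
    """
    # Build one slice per range; +1 because the high value is inclusive.
    slices = tuple(slice(int(lo), int(hi) + 1) for lo, hi in ranges)
    grid = np.mgrid[slices]                 # shape (n, len_0, ..., len_{n-1})
    # Move the coordinate axis to the end, then flatten the grid axes.
    return np.moveaxis(grid, 0, -1).reshape(-1, len(slices))


ds = np.array([[0, 3], [0, 3], [0, 10]])
combos = grid_points(ds)
print(combos.shape)
print(combos[:3])
```

Because mgrid slices accept any start value, the helper also handles ranges whose low value is not 0.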


You can use itertools.product:

    In [16]: from itertools import product

    In [17]: values = list(product(range(4), range(4), range(11)))

    In [18]: values[:5]
    Out[18]: [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 0, 4)]

    In [19]: values[-5:]
    Out[19]: [(3, 3, 6), (3, 3, 7), (3, 3, 8), (3, 3, 9), (3, 3, 10)]

Given an array of ranges, you can do something like the following. (I used a couple of non-zero low values to demonstrate the general case - and to reduce the size of the output. :)

    In [41]: ranges = np.array([[0, 3], [1, 3], [8, 10]])

    In [42]: list(product(*(range(lo, hi+1) for lo, hi in ranges)))
    Out[42]:
    [(0, 1, 8), (0, 1, 9), (0, 1, 10),
     (0, 2, 8), (0, 2, 9), (0, 2, 10),
     (0, 3, 8), (0, 3, 9), (0, 3, 10),
     (1, 1, 8), (1, 1, 9), (1, 1, 10),
     (1, 2, 8), (1, 2, 9), (1, 2, 10),
     (1, 3, 8), (1, 3, 9), (1, 3, 10),
     (2, 1, 8), (2, 1, 9), (2, 1, 10),
     (2, 2, 8), (2, 2, 9), (2, 2, 10),
     (2, 3, 8), (2, 3, 9), (2, 3, 10),
     (3, 1, 8), (3, 1, 9), (3, 1, 10),
     (3, 2, 8), (3, 2, 9), (3, 2, 10),
     (3, 3, 8), (3, 3, 9), (3, 3, 10)]
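If the result is needed as a NumPy array rather than a list of tuples, the output of product can be converted. A minimal sketch, assuming integer ranges; `combos` and `combos_iter` are illustrative names, and the np.fromiter variant is shown only as one way to avoid building the intermediate list of tuples:

```python
from itertools import chain, product

import numpy as np

ranges = np.array([[0, 3], [1, 3], [8, 10]])
axes = [range(lo, hi + 1) for lo, hi in ranges]

# Straightforward conversion: materialise the tuples, then build the array.
combos = np.array(list(product(*axes)))

# Alternative: stream the numbers into a flat array and reshape it,
# one row per combination and one column per range.
flat = np.fromiter(chain.from_iterable(product(*axes)), dtype=int)
combos_iter = flat.reshape(-1, len(axes))

print(combos.shape)
```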

If the low values of all ranges are 0, you can use np.ndindex:

    In [52]: values = list(np.ndindex(4, 4, 11))

    In [53]: values[:5]
    Out[53]: [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3), (0, 0, 4)]

    In [54]: values[-5:]
    Out[54]: [(3, 3, 6), (3, 3, 7), (3, 3, 8), (3, 3, 9), (3, 3, 10)]
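When the low values are not 0 and an array is wanted, one option is to build the zero-based indices first and add the lows afterwards. A sketch, assuming np.indices in place of np.ndindex (it returns arrays instead of an iterator of tuples); `lows`, `shape` and `combos` are illustrative names:

```python
import numpy as np

ranges = np.array([[0, 3], [1, 3], [8, 10]])
lows = ranges[:, 0]
# Number of values in each inclusive range.
shape = tuple(int(n) for n in ranges[:, 1] - lows + 1)

# np.indices(shape) holds one index grid per range. Flattening each grid
# and transposing gives one row per index tuple, in np.ndindex order.
zero_based = np.indices(shape).reshape(len(shape), -1).T

# Broadcasting adds each range's low value to its own column.
combos = zero_based + lows
print(combos[:3])
```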

Source: https://habr.com/ru/post/1240354/

