Find index positions where a 3D array satisfies MULTIPLE conditions

I have a three-dimensional array in which each position along the first two axes holds a short strip of several numbers. Is there a function that returns the index positions where the array satisfies MULTIPLE conditions at once?

I tried the following:

index_pos = numpy.where( array[:,:,0]==10 and array[:,:,1]==15 and array[:,:,2]==30) 

It returns an error:

 ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() 
3 answers

You actually have a special case here, where it would be simpler and more efficient to do the following:

Create data:

    >>> arr
    array([[[ 6,  9,  4],
            [ 5,  2,  1],
            [10, 15, 30]],

           [[ 9,  0,  1],
            [ 4,  6,  4],
            [ 8,  3,  9]],

           [[ 6,  7,  4],
            [ 0,  1,  6],
            [ 4,  0,  1]]])

Expected result:

    >>> index_pos = np.where((arr[:,:,0]==10) & (arr[:,:,1]==15) & (arr[:,:,2]==30))
    >>> index_pos
    (array([0]), array([2]))

Use broadcasting to do this:

    >>> arr == np.array([10,15,30])
    array([[[False, False, False],
            [False, False, False],
            [ True,  True,  True]],

           [[False, False, False],
            [False, False, False],
            [False, False, False]],

           [[False, False, False],
            [False, False, False],
            [False, False, False]]], dtype=bool)
    >>> np.where( np.all(arr == np.array([10,15,30]), axis=-1) )
    (array([0]), array([2]))
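np.where returns one index array per axis. If (row, column) pairs are more convenient, the same mask can be passed to np.argwhere instead. A minimal sketch that rebuilds the example array; the names mask, rows, cols and pairs are illustrative and not part of the answer:

```python
import numpy as np

# The example array from above.
arr = np.array([[[ 6,  9,  4], [ 5,  2,  1], [10, 15, 30]],
                [[ 9,  0,  1], [ 4,  6,  4], [ 8,  3,  9]],
                [[ 6,  7,  4], [ 0,  1,  6], [ 4,  0,  1]]])

# Compare every length-3 strip against the target via broadcasting,
# then require all three components to match.
mask = np.all(arr == np.array([10, 15, 30]), axis=-1)

# np.where gives one index array per axis of the 2D mask.
rows, cols = np.where(mask)

# np.argwhere gives one (row, col) pair per match instead.
pairs = np.argwhere(mask)

print(rows, cols)   # [0] [2]
print(pairs)        # [[0 2]]
```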

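If the indices you want along the last axis are not contiguous, you can do something like this:

    ind_vals = np.array([0,2])
    values = np.array([10,30])  # target values, one per entry of ind_vals (assumed for illustration)
    where_mask = (arr[:,:,ind_vals] == values)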

Broadcast whenever you can.
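A self-contained version of the non-contiguous case might look like the sketch below. The target values 10 and 30 are illustrative choices that match the example array, and the final np.all step is one way (not necessarily the author's) to collapse the mask back to one flag per location:

```python
import numpy as np

# The example array from above.
arr = np.array([[[ 6,  9,  4], [ 5,  2,  1], [10, 15, 30]],
                [[ 9,  0,  1], [ 4,  6,  4], [ 8,  3,  9]],
                [[ 6,  7,  4], [ 0,  1,  6], [ 4,  0,  1]]])

ind_vals = np.array([0, 2])   # positions along the last axis to test
values = np.array([10, 30])   # illustrative targets, one per position

# Fancy indexing on the last axis yields shape (3, 3, 2);
# values (shape (2,)) broadcasts across it.
where_mask = (arr[:, :, ind_vals] == values)

# A location matches only if every selected position matches.
index_pos = np.where(np.all(where_mask, axis=-1))
print(index_pos)
```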

Prompted by @Jamie's comment, here are some interesting things to consider:

    arr = np.random.randint(0,100,(5000,5000,3))

    %timeit np.all(arr == np.array([10,15,30]), axis=-1)
    1 loops, best of 3: 614 ms per loop

    %timeit ((arr[:,:,0]==10) & (arr[:,:,1]==15) & (arr[:,:,2]==30))
    1 loops, best of 3: 217 ms per loop

    %timeit tmp = (arr == np.array([10,15,30])); (tmp[:,:,0] & tmp[:,:,1] & tmp[:,:,2])
    1 loops, best of 3: 368 ms per loop

The question is: why is this?

First, examine:

    %timeit (arr[:,:,0]==10)
    10 loops, best of 3: 51.2 ms per loop

    %timeit (arr == np.array([10,15,30]))
    1 loops, best of 3: 300 ms per loop

One would expect arr == np.array([10,15,30]) to run, at worst, at 1/3 the speed of arr[:,:,0]==10, since it performs three times as many comparisons. Here it is roughly six times slower. Does anyone have an idea why?

Second, there are many ways to combine the results along the final axis:

    tmp = (arr == np.array([10,15,30]))
    method1 = np.all(tmp,axis=-1)
    method2 = (tmp[:,:,0] & tmp[:,:,1] & tmp[:,:,2])
    method3 = np.einsum('ij,ij,ij->ij',tmp[:,:,0] , tmp[:,:,1] , tmp[:,:,2])

    np.allclose(method1,method2)
    True
    np.allclose(method1,method3)
    True

    %timeit np.all(tmp,axis=-1)
    1 loops, best of 3: 318 ms per loop

    %timeit (tmp[:,:,0] & tmp[:,:,1] & tmp[:,:,2])
    10 loops, best of 3: 68.2 ms per loop

    %timeit np.einsum('ij,ij,ij->ij',tmp[:,:,0] , tmp[:,:,1] , tmp[:,:,2])
    10 loops, best of 3: 38 ms per loop

The einsum speedup is well documented elsewhere, but it seems odd to me that there is such a difference between all and consecutive & operations.
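The comparison can be reproduced outside IPython with the standard timeit module. The sketch below uses a smaller array than the benchmark above (an assumption, to keep the run short) and first checks that the three methods agree. The function names are illustrative, and absolute timings depend on the machine and NumPy version, so none are quoted here:

```python
import timeit
import numpy as np

# Smaller than the 5000x5000 benchmark so the script finishes quickly.
rng = np.random.default_rng(0)
arr = rng.integers(0, 100, size=(500, 500, 3))
tmp = (arr == np.array([10, 15, 30]))

def with_all():
    return np.all(tmp, axis=-1)

def with_and():
    return tmp[:, :, 0] & tmp[:, :, 1] & tmp[:, :, 2]

def with_einsum():
    return np.einsum('ij,ij,ij->ij', tmp[:, :, 0], tmp[:, :, 1], tmp[:, :, 2])

# All three must produce the same boolean mask.
assert np.array_equal(with_all(), with_and())
assert np.array_equal(with_all(), with_einsum())

# Best of 3 repeats, 10 calls each, reported per call.
for fn in (with_all, with_and, with_einsum):
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.3f} ms per call")
```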


Python's and operator will not work in this case:

 index_pos = numpy.where(array[:,:,0]==10 and array[:,:,1]==15 and array[:,:,2]==30) 

Try:

 index_pos = numpy.where((array[:,:,0]==10) & (array[:,:,1]==15) & (array[:,:,2]==30)) 
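To see the difference on a small case: and asks Python for a single truth value of the whole array, which NumPy refuses to give, whereas & is applied element by element. The parentheses around each comparison are needed because & binds more tightly than ==. A minimal sketch with made-up arrays a and b:

```python
import numpy as np

a = np.array([10, 10, 3])
b = np.array([15, 7, 15])

# `and` needs bool(a == 10), which is ambiguous for a multi-element array.
try:
    result = (a == 10) and (b == 15)
except ValueError as exc:
    print("and failed:", exc)

# `&` combines the two boolean arrays elementwise.
both = (a == 10) & (b == 15)
print(both)             # [ True False False]
print(np.where(both))
```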

The problem is the use of Python's built-in and keyword, which does not operate element by element on arrays.

Instead, try using the numpy.logical_and function.

    cond1 = np.logical_and(array[:,:,0]==10, array[:,:,1]==15)
    cond2 = np.logical_and(cond1, array[:,:,2]==30)
    index_pos = np.where(cond2)

You can even create your own version of logical_and , which accepts an arbitrary number of conditions:

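    from functools import reduce  # reduce is not a builtin in Python 3

    def my_logical_and(*args):
        # Fold np.logical_and over any number of boolean arrays.
        return reduce(np.logical_and, args)

    condition_locs_and_vals = [(0, 10), (1, 15), (2, 30)]
    conditions = [array[:,:,x] == y for x,y in condition_locs_and_vals]
    my_logical_and(*conditions)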

Using the bitwise and (&) works, but only by coincidence. Bitwise and is meant for comparing bits or bool types. Using it to compare the truth values of numeric arrays is not reliable (for example, if you suddenly need to index on locations where an entry evaluates to True, rather than first converting to a bool array). logical_and really should be used instead of & (even if it comes with a speed penalty).
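The difference shows up as soon as the operands are not boolean. In this small sketch (the arrays x and y are made up for illustration), every element is nonzero and therefore truthy, yet & works on the bit patterns and can return zeros:

```python
import numpy as np

x = np.array([1, 2, 4])
y = np.array([2, 1, 4])

# Bitwise and: 1 & 2 == 0, 2 & 1 == 0, 4 & 4 == 4.
print(x & y)                  # [0 0 4]

# Logical and: each element is first interpreted as True/False.
print(np.logical_and(x, y))   # [ True  True  True]

# On boolean arrays, such as comparison results, the two agree.
p, q = (x > 1), (y > 1)
assert np.array_equal(p & q, np.logical_and(p, q))
```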

Also, chaining arbitrary lists of conditions with & can be painful to both read and type. For code reuse, so that later programmers do not have to rearrange a pile of sub-clauses joined by &, it may be better to store the individual conditions separately and then combine them with a function like the one above.
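NumPy ufuncs also carry a reduce method, so a stored list of conditions can be combined without a hand-written helper. A minimal sketch, assuming the example array from the first answer; the names combined and index_pos are illustrative:

```python
import numpy as np

# The example array from the first answer.
array = np.array([[[ 6,  9,  4], [ 5,  2,  1], [10, 15, 30]],
                  [[ 9,  0,  1], [ 4,  6,  4], [ 8,  3,  9]],
                  [[ 6,  7,  4], [ 0,  1,  6], [ 4,  0,  1]]])

# Each pair is (position along the last axis, required value).
condition_locs_and_vals = [(0, 10), (1, 15), (2, 30)]
conditions = [array[:, :, x] == y for x, y in condition_locs_and_vals]

# The list is treated as a stacked array and reduced over its first axis,
# i.e. over the individual conditions.
combined = np.logical_and.reduce(conditions)
index_pos = np.where(combined)
print(index_pos)
```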


Source: https://habr.com/ru/post/957350/

