NumPy nonzero / flatnonzero index order; order of returned items in boolean indexing

Question


I am interested to know the order of indexes returned by numpy.nonzero / numpy.flatnonzero.

I could not find anything in the docs about this. They just say:

A[nonzero(flag)] == A[flag]

In most cases this is enough, but there are some cases where you need a sorted list of indexes. Is it guaranteed that the returned indexes are sorted in the 1-D case, or do I need to sort them explicitly? (A related question is the order of elements returned by simply selecting with a boolean array (A[flag]), which, according to the docs, must be the same.)

Example: finding the "gaps" between True elements in flag:

 import numpy as np

 flag = np.array([True, False, False, True], dtype=bool)
 iflag = np.flatnonzero(flag)
 gaps = iflag[1:] - iflag[:-1]

Thanks.


python numpy

ovga Mar 14 '13 at 14:29


1 answer

senderle · Answer 1 · 2013-03-14T15:02:46+0000

Given the specification of advanced (or "fancy") indexing with integers, the guarantee that A[nonzero(flag)] == A[flag] is also a guarantee that the values are sorted from low to high in the 1-D case. However, in higher dimensions, the result (while "sorted") has a different structure than you might expect.

In short, given a 1-dimensional array of integers ind and a 1-dimensional array x, the following holds for every valid index i of ind:

 result[i] = x[ind[i]]

result takes the shape of ind and contains the values of x at the indices given by ind. From this we can deduce that if x[flag] preserves the original order of x, and if x[nonzero(flag)] is the same as x[flag], then nonzero(flag) must always produce indexes in sorted order.

The only catch is that for multidimensional arrays, the indices are stored as separate arrays, one for each dimension being indexed. In other words,
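The 1-D reasoning can be checked empirically. What follows is a minimal sketch, assuming NumPy is imported as np; the arrays x and flag are made-up illustration data, and a passing check on one example is not a documented guarantee.

```python
import numpy as np

# Hypothetical illustration data (not taken from the question).
x = np.array([10, 20, 30, 40, 50])
flag = np.array([True, False, True, True, False])

# Integer positions of the True entries.
ind = np.flatnonzero(flag)

# Integer (fancy) indexing: result[i] == x[ind[i]] for every valid i.
result = x[ind]
elementwise = np.array([x[i] for i in ind])

# Compare against plain boolean indexing and test that ind is increasing.
same_as_boolean = np.array_equal(result, x[flag])
is_sorted = bool(np.all(np.diff(ind) > 0))

print(ind, result, same_as_boolean, is_sorted)
```

If sortedness matters for correctness in your own code, an assertion like the is_sorted check is a cheap safeguard.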

 x[array([0, 1, 2]), array([0, 0, 0])]

is equivalent to

 array([x[0, 0], x[1, 0], x[2, 0]])

The values are still sorted, but each dimension is split off into its own array. (As a result, you can do interesting things with broadcasting, but that is beyond the scope of this answer.)

The only problem with this line of reasoning is that, to my great surprise, I cannot find an explicit statement guaranteeing that boolean indexing preserves the original order of the array. Nevertheless, I am quite certain that it does. More generally, it would be incredibly perverse if x[[True, True, True]] returned a reversed version of x.
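To make the multidimensional case concrete, here is a small sketch on a 2-D array. The data is hypothetical and chosen only for illustration. It shows nonzero returning one index array per dimension, and how those arrays pair up when used together as an index.

```python
import numpy as np

# Hypothetical 2-D illustration data: values 0..11 in a 3x4 grid.
x = np.arange(12).reshape(3, 4)
flag = (x % 5 == 0)              # True where x is 0, 5 or 10

# One index array per dimension.
rows, cols = np.nonzero(flag)

# Indexing with both arrays pairs them element by element:
# picked[i] == x[rows[i], cols[i]].
picked = x[rows, cols]
paired = np.array([x[r, c] for r, c in zip(rows, cols)])

# Convert the (row, col) pairs to flat positions for comparison
# with flatnonzero.
flat = np.ravel_multi_index((rows, cols), x.shape)

print(rows, cols, picked, flat)
```

In this example the flat positions match np.flatnonzero(flag) and come out in increasing order, which is consistent with the 1-D argument.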
