How to apply numpy.argpartition output to 2-D arrays?

Question

I have a pretty large 2d numpy array, and I want to extract the lowest 10 elements of each row, as well as their indices. Since my array is quite large, I would prefer not to sort the entire array.

I heard about the argpartition() function, with which I can get the indices of the 10 lowest elements:

 top10indexes = np.argpartition(myBigArray,10)[:,:10]

Note that argpartition() operates along axis -1 by default, which is what I want. The result has the same shape as myBigArray and holds, for each row, indices into that row, arranged so that the first 10 indices point to the 10 lowest values.

How can I extract the myBigArray elements matching these indices?

The obvious fancy indexing, like myBigArray[top10indexes] or myBigArray[:,top10indexes], does something completely different. I could also use a list comprehension, for example:

 np.array([row[idxs] for row, idxs in zip(myBigArray, top10indexes)])

but this incurs a performance hit from iterating over the rows in Python and converting the result back to an array.
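To make the difference concrete, here is a minimal sketch on a toy array (the names a, idx and paired are made up for illustration, and 2 lowest values stand in for 10). It only compares result shapes, which is enough to show that the two direct indexing attempts do not pair each row with its own indices:

```python
import numpy as np

a = np.arange(24).reshape(6, 4)             # toy array: 6 rows, 4 columns
idx = np.argpartition(a, 2, axis=1)[:, :2]  # indices of the 2 lowest per row, shape (6, 2)

# Here every entry of idx selects a whole row of a.
print(a[idx].shape)      # (6, 2, 4)

# Here the full idx array is applied to the columns of every row.
print(a[:, idx].shape)   # (6, 6, 2)

# The list comprehension pairs each row with its own indices.
paired = np.array([row[i] for row, i in zip(a, idx)])
print(paired.shape)      # (6, 2)
```

Only the last form has the wanted shape, one row of lowest values per input row.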

nb: I could just use np.partition() to get the values, and they might even line up with the indices (or maybe not...), but I would rather not do the partitioning twice if I can avoid it.

performance python arrays numpy indexing

drevicko Oct 12 '14 at 5:38

1 answer

Saullo castro · Accepted Answer · 2014-10-12T08:38:05+0000

You can avoid using flattened copies and having to extract all values by doing the following:

 num = 10
 top = np.argpartition(myBigArray, num, axis=1)[:, :num]
 myBigArray[np.arange(myBigArray.shape[0])[:, None], top]

For NumPy >= 1.9.0 this will be very efficient, with performance comparable to np.take().
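As a self-contained check of the indexing above, the sketch below runs it on a small random array and compares the result with a full sort. The array size, the seed and the names rows, lowest, lowest_alt and expected are assumptions made for the demo. It also shows np.take_along_axis, which is not part of the answer above but, assuming NumPy >= 1.15, should perform the same gather:

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen arbitrarily for the demo
myBigArray = rng.random((5, 50))    # small stand-in for the real array
num = 10

# Column indices of the num smallest entries in each row (unordered within a row).
top = np.argpartition(myBigArray, num, axis=1)[:, :num]

# Row indices of shape (5, 1) broadcast against top of shape (5, num),
# so each row is paired with its own column indices.
rows = np.arange(myBigArray.shape[0])[:, None]
lowest = myBigArray[rows, top]

# Alternative gather, assuming NumPy >= 1.15 where take_along_axis is available.
lowest_alt = np.take_along_axis(myBigArray, top, axis=1)

# Check against a full sort: the same values per row, ignoring order.
expected = np.sort(myBigArray, axis=1)[:, :num]
assert np.array_equal(np.sort(lowest, axis=1), expected)
assert np.array_equal(lowest, lowest_alt)
```

Note that the values within each row of lowest are not sorted; argpartition only guarantees that they are the num smallest.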
