The smallest values from the upper triangular matrix with their indices in the form of a list of tuples

Question

I have an np.ndarray as follows:

    [[ inf  1.  3.  2.  1.]
     [ inf inf  2.  3.  2.]
     [ inf inf inf  5.  4.]
     [ inf inf inf inf  1.]
     [ inf inf inf inf inf]]

Is there a way to get the indices and values of the m smallest items in this nd array? So, if I wanted the 4 smallest, that would be

 [(0,1,1),(0,4,1),(3,4,1),(0,3,2)]

where (row, col, val) is the notation above.

If there are ties, any one of the tied entries may be chosen arbitrarily. For example, there are three 1s, and the next smallest value is 2, so (0,3,2), (1,2,2), and (1,4,2) are all valid choices for the fourth entry.

Essentially, can I extract the k smallest values in this format from an upper triangular matrix efficiently? (The matrix is much larger than the example above.) I tried flattening it, using square form and nsmallest, but I am having trouble getting the indices and values to line up. Thanks!

+5

python arrays numpy pandas min

Mike el jackson Feb 01 '17 at 1:25

3 answers

Something like this works:

    import numpy as np

    a = np.random.rand(4,4)
    tuples = [(ix,iy, a[ix,iy]) for ix, row in enumerate(a) for iy, i in enumerate(row)]
    sorted(tuples,key=lambda x: x[2])[:10]

Here the slice [:10] corresponds to k = 10 in your question.

If you need only the upper triangular elements, you can add a condition to the list comprehension:

    a = np.random.rand(4,4)
    tuples = [(ix,iy, a[ix,iy]) for ix, row in enumerate(a) for iy, i in enumerate(row) if ix<=iy]
    sorted(tuples,key=lambda x: x[2])

0

Bob baxley Feb 01 '17 at 1:44

If my np.array() is n, I could get the smallest values from it by flattening it (using np.ndenumerate()) and then using the heapq.heapify() and heapq.nsmallest() functions, like so:

    import heapq
    import numpy as np

    # tuples reversed to (value, coords) for natural sorting on values rather than co-ords
    flattened = [(y,x) for x,y in np.ndenumerate(n)]
    heapq.heapify(flattened)
    results = heapq.nsmallest(4, flattened)

But this uses a lot of extra memory and pulls the data and coordinates out of NumPy's efficient arrays into native Python lists. So there are probably much better ways to do this more natively in NumPy.
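As a rough sketch of how the memory cost could be reduced while staying with heapq: passing a generator to heapq.nsmallest() avoids building the full list first. The array n and the value k = 4 below are taken from the question; the names pairs and out are only illustrative. This still loops in Python, so it should not be expected to match the NumPy-native approaches for large matrices.

```python
import heapq
import numpy as np

inf = np.inf
# Example array from the question
n = np.array([
    [inf, 1., 3., 2., 1.],
    [inf, inf, 2., 3., 2.],
    [inf, inf, inf, 5., 4.],
    [inf, inf, inf, inf, 1.],
    [inf, inf, inf, inf, inf],
])
k = 4

# Lazily yield (value, (row, col)) so that tuples compare on value first
pairs = ((val, idx) for idx, val in np.ndenumerate(n))

# nsmallest consumes the generator without materializing every pair in a list
results = heapq.nsmallest(k, pairs)

# Rearrange into the (row, col, val) format from the question
out = [(int(i), int(j), float(val)) for val, (i, j) in results]
print(out)
```

Because the tuples compare on value and then on coordinates, ties here are broken by the smaller (row, col) rather than randomly.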

0

Jim dennis Feb 01 '17 at 2:40

Divakar · Accepted Answer · 2017-02-01T01:54:21+0000

For an Inf-filled array -

    r,c = np.unravel_index(a.ravel().argsort()[:4], a.shape)
    out = list(zip(r,c,a[r,c]))  # list() materializes the zip iterator on Python 3

For performance, consider using np.argpartition, which avoids a full sort. So, replace a.ravel().argsort()[:4] with np.argpartition(a.ravel(), range(4))[:4].
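A minimal sketch of a variant of that replacement, run on the array from the question. Instead of passing range(4) as kth, it passes a single kth and then sorts only the k selected candidates; this is an assumption about what might suit large arrays, not part of the answer above. The names k, flat and part are illustrative.

```python
import numpy as np

inf = np.inf
a = np.array([
    [inf, 1., 3., 2., 1.],
    [inf, inf, 2., 3., 2.],
    [inf, inf, inf, 5., 4.],
    [inf, inf, inf, inf, 1.],
    [inf, inf, inf, inf, inf],
])
k = 4

flat = a.ravel()
# With a single kth, the first k positions hold the k smallest values,
# but in no guaranteed order
part = np.argpartition(flat, k - 1)[:k]
# Sort only those k candidates by value
part = part[np.argsort(flat[part])]

r, c = np.unravel_index(part, a.shape)
# Convert NumPy scalars to plain Python numbers for a clean list of tuples
out = [(int(i), int(j), float(v)) for i, j, v in zip(r, c, flat[part])]
print(out)
```

Which of the tied 2s ends up in the last slot, and the order of the three 1s, is left to NumPy, which matches the "any one of the ties" requirement in the question.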

Sample run -

    In [285]: a
    Out[285]:
    array([[ inf,   1.,   3.,   2.,   1.],
           [ inf,  inf,   2.,   3.,   2.],
           [ inf,  inf,  inf,   5.,   4.],
           [ inf,  inf,  inf,  inf,   1.],
           [ inf,  inf,  inf,  inf,  inf]])

    In [286]: out
    Out[286]: [(0, 1, 1.0), (0, 4, 1.0), (3, 4, 1.0), (0, 3, 2.0)]

For the general case -

    R,C = np.triu_indices(a.shape[1],1)  # indices strictly above the diagonal
    idx = a[R,C].argsort()[:4]
    r,c = R[idx], C[idx]
    out = list(zip(r,c,a[r,c]))  # list() materializes the zip iterator on Python 3

Sample run -

    In [351]: a
    Out[351]:
    array([[ 68.,  67.,  81.,  23.,  16.],
           [ 84.,  83.,  20.,  66.,  48.],
           [ 58.,  72.,  98.,  63.,  30.],
           [ 61.,  40.,   1.,  86.,  22.],
           [ 29.,  95.,  38.,  22.,  95.]])

    In [352]: out
    Out[352]: [(0, 4, 16.0), (1, 2, 20.0), (3, 4, 22.0), (0, 3, 23.0)]

For performance, consider using np.argpartition, which avoids a full sort. So, replace a[R,C].argsort()[:4] with np.argpartition(a[R,C], range(4))[:4].
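The generic case could be wrapped up as a small helper along these lines. The function name k_smallest_upper and its handling of edge cases (k of zero, k larger than the number of upper-triangular entries) are assumptions for illustration, and a square matrix is assumed. It uses a single kth in np.argpartition and then sorts only the k selected entries.

```python
import numpy as np

def k_smallest_upper(a, k):
    """Return up to k (row, col, val) tuples for the smallest entries
    strictly above the diagonal of a square matrix (hypothetical helper)."""
    a = np.asarray(a)
    # Row and column indices strictly above the main diagonal
    R, C = np.triu_indices(a.shape[0], 1)
    vals = a[R, C]
    k = min(k, vals.size)
    if k == 0:
        return []
    # Select the k smallest in arbitrary order, then sort just those k
    idx = np.argpartition(vals, k - 1)[:k]
    idx = idx[np.argsort(vals[idx])]
    return [(int(R[i]), int(C[i]), float(vals[i])) for i in idx]

# Matrix from the sample run above
a = np.array([
    [68., 67., 81., 23., 16.],
    [84., 83., 20., 66., 48.],
    [58., 72., 98., 63., 30.],
    [61., 40., 1., 86., 22.],
    [29., 95., 38., 22., 95.],
])
out = k_smallest_upper(a, 4)
print(out)  # [(0, 4, 16.0), (1, 2, 20.0), (3, 4, 22.0), (0, 3, 23.0)]
```

Values below the diagonal (such as the 1. at position (3, 2)) are ignored, which is why the result starts at 16.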
