Make a NumPy upper triangular matrix padded with NaN instead of zero

I am creating a matplotlib 3D surface chart. I only need to see the upper triangular half of the matrix on the graph, since the other half is redundant.

np.triu() fills the redundant half of the matrix with zeros, but I would prefer NaNs there, so that those cells do not appear on the surface at all.

What would be the Pythonic way to fill with NaN instead of zeros? I cannot search and replace 0 with NaN, since zeros also occur in the legitimate data that I want to display.

+5
3 answers

You can use numpy.tril_indices() to assign a NaN value to the lower triangle, for example:

    >>> import numpy as np
    >>> m = np.triu(np.arange(0, 12, dtype=float).reshape(4, 3))
    >>> m
    array([[ 0.,  1.,  2.],
           [ 0.,  4.,  5.],
           [ 0.,  0.,  8.],
           [ 0.,  0.,  0.]])
    >>> m[np.tril_indices(m.shape[0], -1)] = np.nan
    >>> m
    array([[  0.,   1.,   2.],
           [ nan,   4.,   5.],
           [ nan,  nan,   8.],
           [ nan,  nan,  nan]])
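One caveat about the call above: `np.tril_indices(m.shape[0], -1)` generates indices for a square n×n array, so it can index out of bounds when the matrix has at least two more rows than columns. A minimal sketch of a shape-safe variant, assuming a helper (the name `upper_with_nan` is hypothetical, not part of NumPy) that passes the column count as the third argument of `np.tril_indices` and skips `np.triu` entirely, so legitimate zeros are left alone:

```python
import numpy as np

def upper_with_nan(a):
    """Return a float copy of 2D array `a` with everything below the main diagonal set to NaN."""
    out = np.array(a, dtype=float)  # copy; NaN requires a float dtype
    rows, cols = out.shape
    # k=-1 excludes the main diagonal; m=cols keeps the indices inside a non-square array
    out[np.tril_indices(rows, k=-1, m=cols)] = np.nan
    return out

m = np.arange(18).reshape(6, 3)   # 6 rows, 3 columns
result = upper_with_nan(m)
print(result)
```

The input array is not modified, and the zero at position (0, 0) survives because nothing is ever compared against 0.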
+9

tril_indices() is probably the obvious approach here: it generates the lower triangular indices, which you can then use to set those positions of the input array to NaN.

If you care about performance, you can instead build a boolean mask of the lower triangle and use boolean indexing to set those elements to NaN. The implementation looks like this:

 m[np.arange(m.shape[0])[:,None] > np.arange(m.shape[1])] = np.nan 

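Here, np.arange(m.shape[0])[:,None] > np.arange(m.shape[1]) is a mask created using broadcasting: a column of row indices is compared against a row of column indices, giving True wherever the row index exceeds the column index.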

Sample run:

    In [51]: m
    Out[51]:
    array([[ 11.,  49.,  23.,  30.],
           [ 40.,  41.,  19.,  26.],
           [ 32.,  36.,  30.,  25.],
           [ 15.,  27.,  25.,  40.],
           [ 33.,  18.,  45.,  43.]])

    In [52]: np.arange(m.shape[0])[:,None] > np.arange(m.shape[1]) # mask
    Out[52]:
    array([[False, False, False, False],
           [ True, False, False, False],
           [ True,  True, False, False],
           [ True,  True,  True, False],
           [ True,  True,  True,  True]], dtype=bool)

    In [53]: m[np.arange(m.shape[0])[:,None] > np.arange(m.shape[1])] = np.nan

    In [54]: m
    Out[54]:
    array([[ 11.,  49.,  23.,  30.],
           [ nan,  41.,  19.,  26.],
           [ nan,  nan,  30.,  25.],
           [ nan,  nan,  nan,  40.],
           [ nan,  nan,  nan,  nan]])
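If the original array has to stay intact, the same broadcast mask can be combined with `np.where` to build a new array rather than assigning in place. This is only a sketch; the variable names `mask` and `masked` are illustrative:

```python
import numpy as np

m = np.arange(20, dtype=float).reshape(5, 4)

# Row indices as a column vector compared with column indices as a row vector:
# broadcasting yields a (5, 4) boolean array that is True below the main diagonal.
mask = np.arange(m.shape[0])[:, None] > np.arange(m.shape[1])

# np.where returns a new array: NaN where the mask is True, m's value elsewhere.
masked = np.where(mask, np.nan, m)
print(masked)
```

Keeping `mask` in a variable also makes it reusable, for example to hide the same cells in several arrays of the same shape.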

Runtime tests:

This section compares the boolean indexing approach from this answer with the np.tril_indices approach from the other answer.

    In [38]: m = np.random.randint(10,50,(1000,1100)).astype(float)

    In [39]: %timeit m[np.tril_indices(m.shape[0], -1)] = np.nan
    10 loops, best of 3: 62.8 ms per loop

    In [40]: m = np.random.randint(10,50,(1000,1100)).astype(float)

    In [41]: %timeit m[np.arange(m.shape[0])[:,None] > np.arange(m.shape[1])] = np.nan
    100 loops, best of 3: 8.03 ms per loop
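To repeat the comparison outside IPython, a rough sketch using the standard-library `timeit` module might look like the following. The function names are made up for illustration, and the timings will differ by machine and NumPy version, so no expected numbers are given:

```python
import timeit
import numpy as np

def with_tril_indices(m):
    # Index-based approach: build explicit (row, col) index arrays, then assign.
    m[np.tril_indices(m.shape[0], -1)] = np.nan

def with_broadcast_mask(m):
    # Mask-based approach: build a boolean mask via broadcasting, then assign.
    m[np.arange(m.shape[0])[:, None] > np.arange(m.shape[1])] = np.nan

base = np.random.randint(10, 50, (1000, 1100)).astype(float)
a, b = base.copy(), base.copy()

runs = 10
t_indices = timeit.timeit(lambda: with_tril_indices(a), number=runs)
t_mask = timeit.timeit(lambda: with_broadcast_mask(b), number=runs)
print(f"tril_indices:   {t_indices / runs * 1e3:.2f} ms per call")
print(f"broadcast mask: {t_mask / runs * 1e3:.2f} ms per call")

# Both approaches should put NaN in exactly the same cells.
same = np.array_equal(np.isnan(a), np.isnan(b))
print("same NaN pattern:", same)
```

As in the original measurement, each timed call includes the cost of building the indices or the mask, not just the assignment.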
+1

The shape or layout does not matter for this example, so suppose we have a 2D array a:

    >>> a
    array([[ 0.,  0.,  0.],
           [ 0.,  0.,  1.]])

and we want all 0 values to be NaN. Just use a list comprehension.

    >>> import numpy
    >>> b = numpy.array([[i if i else numpy.nan for i in j] for j in a])
    >>> b
    array([[ nan,  nan,  nan],
           [ nan,  nan,   1.]])

If only certain cells should be replaced, specify that condition in the comprehension.
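The comprehension loops over every element in Python. A vectorized equivalent, offered here only as a sketch, is `np.where`. Like the comprehension, it turns every zero into NaN, including the legitimate zeros the question wants to keep, so it only fits data where zero never carries meaning:

```python
import numpy as np

a = np.array([[0., 0., 0.],
              [0., 0., 1.]])

# Element-wise selection: NaN where the value equals zero, the original value elsewhere.
b = np.where(a == 0, np.nan, a)
print(b)
```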

-1

Source: https://habr.com/ru/post/1233303/

