Replacing values that exceed a limit in a numpy array

I have an n×m array and a maximum value for each column. What is the best way to replace the values that exceed the maximum, other than checking each item?

For instance:

    import numpy as np

    def check_limits(bad_array, maxs):
        good_array = np.copy(bad_array)
        for i_line in range(bad_array.shape[0]):
            for i_column in range(bad_array.shape[1]):
                if good_array[i_line][i_column] >= maxs[i_column]:
                    good_array[i_line][i_column] = maxs[i_column] - 1
        return good_array

Is there any way to make it faster and more concise?

4 answers

Use putmask:

    import numpy as np

    a = np.array([[ 0,  1,  2,  3],
                  [ 4,  5,  6,  7],
                  [ 8,  9, 10, 11]])
    m = np.array([7, 6, 5, 4])

    # This is what you need:
    np.putmask(a, a >= m, m - 1)

    # a is now:
    # np.array([[0, 1, 2, 3],
    #           [4, 5, 4, 3],
    #           [6, 5, 4, 3]])
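A note on the putmask call above, offered as a sketch rather than part of the original answer: np.putmask repeats the replacement values cyclically over the flattened array, so m - 1 lines up with the columns here because each row of a has len(m) entries. For integer arrays, np.minimum with broadcasting should give the same result without modifying a in place. The names capped and in_place are illustrative only.

```python
import numpy as np

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])
m = np.array([7, 6, 5, 4])

# Broadcasting compares every row of `a` against the per-column caps `m - 1`.
# Assumption: integer data, where `a >= m` is the same condition as `a > m - 1`.
capped = np.minimum(a, m - 1)

# In-place version from the answer, applied to a copy so `a` stays untouched.
in_place = a.copy()
np.putmask(in_place, in_place >= m, m - 1)

print(np.array_equal(capped, in_place))
```

With floating-point data the two are not interchangeable: a value such as 6.5 with a limit of 7 is left alone by the putmask rule but lowered to 6 by np.minimum.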

If we assume nothing about the structure of bad_array, your code is optimal by an adversary argument: every element has to be inspected. If we knew that each column is sorted in ascending order, then as soon as we reached a value exceeding the maximum, we would know that every subsequent element in that column also exceeds it. Without such an assumption, we simply have to check each element.

If you decide to sort each column first, that takes (n columns × n log n) time, which is already more than the n × n time required to check each element.
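For the case where the columns are already sorted in ascending order, the idea can be sketched as follows. This is an illustration under that assumption, not code from the answer: cap_sorted_columns is a hypothetical helper, and np.searchsorted is used to find the first offending row in each column by binary search.

```python
import numpy as np

def cap_sorted_columns(sorted_array, maxs):
    """Cap values >= maxs[j] at maxs[j] - 1, assuming each column is ascending."""
    result = sorted_array.copy()
    for j, limit in enumerate(maxs):
        # Index of the first entry >= limit; every later entry in the
        # column is at least as large, so it must be capped as well.
        first_bad = np.searchsorted(result[:, j], limit, side="left")
        result[first_bad:, j] = limit - 1
    return result

# The example array used elsewhere on this page happens to have sorted columns.
columns_sorted = np.array([[0, 1, 2, 3],
                           [4, 5, 6, 7],
                           [8, 9, 10, 11]])
capped_sorted = cap_sorted_columns(columns_sorted, np.array([7, 6, 5, 4]))
print(capped_sorted)
```

Only the comparisons are saved this way; every element past the cut-off still has to be written.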

You can also build good_array by checking and copying one element at a time, instead of copying all the elements from bad_array and then checking them afterwards. This should roughly halve the running time.
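One vectorized reading of that suggestion, given as a sketch and not as the answerer's code, is np.where: it builds the result in a single pass, choosing for each position either the cap or the original value. No timing is claimed here; whether it is actually faster would need to be measured.

```python
import numpy as np

bad_array = np.array([[0, 1, 2, 3],
                      [4, 5, 6, 7],
                      [8, 9, 10, 11]])
maxs = np.array([7, 6, 5, 4])

# Each output element is maxs[j] - 1 where the limit is reached,
# and the original value otherwise; bad_array itself is not modified.
good_array = np.where(bad_array >= maxs, maxs - 1, bad_array)
print(good_array)
```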


If the number of columns is small, one optimization is to loop over the columns only and handle each column with a boolean mask:

    import numpy as np

    def check_limits(bad_array, maxs):
        good_array = np.copy(bad_array)
        for i_column in range(bad_array.shape[1]):
            # Rows in this column that reach or exceed the column's limit
            to_replace = (good_array[:, i_column] >= maxs[i_column])
            good_array[to_replace, i_column] = maxs[i_column] - 1
        return good_array

Another way is to use the clip function. Using eumiro's example:

    bad_array = np.array([[ 0,  1,  2,  3],
                          [ 4,  5,  6,  7],
                          [ 8,  9, 10, 11]])
    maxs = np.array([7, 6, 5, 4])
    good_array = bad_array.clip(max=maxs-1)

or, writing into an existing array:

    bad_array.clip(max=maxs-1, out=good_array)

You can also specify a lower limit by adding the min= argument.
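A minimal sketch of clip with both limits and an output array. The lower limits in mins are made up for illustration, and np.empty_like is just one way to preallocate the array passed as out.

```python
import numpy as np

bad_array = np.array([[0, 1, 2, 3],
                      [4, 5, 6, 7],
                      [8, 9, 10, 11]])
maxs = np.array([7, 6, 5, 4])
mins = np.array([1, 1, 1, 1])  # hypothetical per-column lower limits

# `out` must already exist with a matching shape and dtype.
good_array = np.empty_like(bad_array)
bad_array.clip(min=mins, max=maxs - 1, out=good_array)
print(good_array)
```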


Source: https://habr.com/ru/post/1345988/

