NumPy: fastest way to insert a value into an array so that the array stays sorted

Suppose I have an array my_array and a single value my_val. (Note that my_array is always sorted.)

my_array = np.array([1, 2, 3, 4, 5])
my_val = 1.5

Since my_val is 1.5, I want to put it between 1 and 2, giving me the array [1, 1.5, 2, 3, 4, 5].

My question is: what is the fastest way (i.e. in microseconds) to produce the ordered output array as my_array grows arbitrarily large?

My original approach was to concatenate the value onto the original array and then sort:

arr_out = np.sort(np.concatenate((my_array, np.array([my_val]))))
[ 1.   1.5  2.   3.   4.   5. ]

I know that np.concatenate is fast, but I'm not sure how np.sort will scale as my_array grows, even given that my_array is already sorted.

Edit:

Here are timings for my method and the two suggested in the answers:

Input:

import timeit

timeit_setup = 'import numpy as np\n' \
               'my_array = np.array([i for i in range(1000)], dtype=np.float64)\n' \
               'my_val = 1.5'
num_trials = 1000

my_time = timeit.timeit(
    'np.sort(np.concatenate((my_array, np.array([my_val]))))',
    setup=timeit_setup, number=num_trials
)

pauls_time = timeit.timeit(
    'idx = my_array.searchsorted(my_val)\n'
    'np.concatenate((my_array[:idx], [my_val], my_array[idx:]))',
    setup=timeit_setup, number=num_trials
)

sanchit_time = timeit.timeit(
    'np.insert(my_array, my_array.searchsorted(my_val), my_val)',
    setup=timeit_setup, number=num_trials
)

print('Times for 1000 repetitions for array of length 1000:')
print("My method took {}s".format(my_time))
print("Paul Panzer method took {}s".format(pauls_time))
print("Sanchit Anand method took {}s".format(sanchit_time))

Output:

Times for 1000 repetitions for array of length 1000:
My method took 0.017865657746239747s
Paul Panzer method took 0.005813951002013821s
Sanchit Anand method took 0.014003945532323987s

For 100 repetitions with an array of length 1,000,000:

Times for 100 repetitions for array of length 1000000:
My method took 3.1770704101754195s
Paul Panzer method took 0.3931240139911161s
Sanchit Anand method took 0.40981490723551417s
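
Timings like these only mean something if the three expressions return the same array. A minimal sketch of that check, reusing the benchmark's setup (the names by_sort, by_concat, by_insert and same are introduced here for illustration only):

```python
import numpy as np

# Same setup as the benchmark: a sorted float64 array and one value to insert.
my_array = np.arange(1000, dtype=np.float64)
my_val = 1.5

# Method 1: append the value, then sort the whole array.
by_sort = np.sort(np.concatenate((my_array, np.array([my_val]))))

# Method 2: binary search for the position, then join three pieces.
idx = my_array.searchsorted(my_val)
by_concat = np.concatenate((my_array[:idx], [my_val], my_array[idx:]))

# Method 3: binary search for the position, then np.insert.
by_insert = np.insert(my_array, my_array.searchsorted(my_val), my_val)

# All three should be element-wise identical.
same = np.array_equal(by_sort, by_concat) and np.array_equal(by_sort, by_insert)
print(same)
```

The check uses a float64 array, as the benchmark does. With an integer array the np.insert result would differ, because the inserted value is cast to the array's dtype.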

Use np.searchsorted to find the insertion point in logarithmic time:

>>> idx = my_array.searchsorted(my_val)
>>> np.concatenate((my_array[:idx], [my_val], my_array[idx:]))
array([1. , 1.5, 2. , 3. , 4. , 5. ])
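
The same idea can be wrapped in a function that allocates the output once and copies the two halves in place. This is only a sketch of an alternative to np.concatenate, not something from the thread: the name insert_sorted is hypothetical, and whether it is faster than the one-liner above would have to be measured.

```python
import numpy as np

def insert_sorted(arr, val):
    """Return a new sorted array with val inserted; arr must already be sorted."""
    # Binary search for the insertion index: O(log n).
    idx = np.searchsorted(arr, val)
    # Choose a dtype that can hold both the existing elements and the new value,
    # so inserting 1.5 into an integer array is not truncated.
    out = np.empty(arr.size + 1, dtype=np.result_type(arr, val))
    # Copying the two halves is O(n); no way around it for a new array.
    out[:idx] = arr[:idx]
    out[idx] = val
    out[idx + 1:] = arr[idx:]
    return out

result = insert_sorted(np.array([1, 2, 3, 4, 5]), 1.5)
print(result)
```

The search is logarithmic, but the total cost is still linear in the array length because every element has to be copied into the new array.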

Note 1: See also the comments by @Willem Van Onselm and @hpaulj.

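Note 2: np.insert, as suggested by @Sanchit Anand, is more convenient when the dtypes already match, but that convenience comes with overhead. For example, on the 5-element array: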

>>> def f_pp(my_array, my_val):
...      idx = my_array.searchsorted(my_val)
...      return np.concatenate((my_array[:idx], [my_val], my_array[idx:]))
... 
>>> def f_sa(my_array, my_val):
...      return np.insert(my_array, my_array.searchsorted(my_val), my_val)
...
>>> my_farray = my_array.astype(float)
>>> from timeit import repeat
>>> kwds = dict(globals=globals(), number=100000)
>>> repeat('f_sa(my_farray, my_val)', **kwds)
[1.2453778409981169, 1.2268288589984877, 1.2298014000116382]
>>> repeat('f_pp(my_array, my_val)', **kwds)
[0.2728819379990455, 0.2697303680033656, 0.2688361559994519]

my_array = np.insert(my_array, my_array.searchsorted(my_val), my_val)

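[EDIT] Make sure the array has dtype float32 or float64; otherwise np.insert casts the inserted value to the array's integer dtype.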
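
The dtype caveat is easy to reproduce. A short sketch using the question's values (the names int_array, truncated and correct are illustrative only):

```python
import numpy as np

my_val = 1.5

# Integer array: np.insert converts my_val to the array's dtype, so 1.5 becomes 1.
int_array = np.array([1, 2, 3, 4, 5])
truncated = np.insert(int_array, int_array.searchsorted(my_val), my_val)

# Float array: the value is stored as given.
float_array = int_array.astype(np.float64)
correct = np.insert(float_array, float_array.searchsorted(my_val), my_val)

print(truncated.dtype, truncated)
print(correct.dtype, correct)
```

Converting with astype(np.float64) before inserting, as the other answer does with my_farray, avoids the silent truncation.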


Source: https://habr.com/ru/post/1693606/

