Indexing numpy record arrays is very slow

Question

Indexing numpy record arrays with an index array seems excessively slow. However, the same operation runs 10-15 times faster when it goes through a view (ndarray.view).

Is there a reason for this difference? Why isn't indexing of record arrays implemented in a faster way? (See also: sorting numpy structured and record arrays is very slow.)

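import numpy as np

# Structured dtype: int32 field "foo" and int64 field "bar" (12 bytes per record)
mydtype = np.dtype("i4,i8")
mydtype.names = ("foo","bar")
N = 100000

foobar = np.zeros(N,dtype = mydtype)
foobar["foo"] = np.random.randint(0,100,N)
foobar["bar"] = np.random.randint(0,10000,N)

# Index array that sorts by "bar" first, then by "foo"
b = np.lexsort((foobar["foo"],foobar["bar"]))

# Direct fancy indexing of the record array
%timeit foobar[b]
100 loops, best of 3: 11.2 ms per loop

# Same operation through a 12-byte string view
%timeit foobar.view("|S12")[b].view(mydtype)
1000 loops, best of 3: 882 µs per loop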

Obviously, both approaches give the same result.
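For anyone who wants to confirm that claim rather than take it on faith, here is a minimal sketch of a check. It assumes a smaller N and a seeded generator (np.random.default_rng) purely for reproducibility, and it derives the view width from mydtype.itemsize instead of hard-coding 12; none of that is in the original post.

```python
import numpy as np

# Rebuild the array from the question (smaller N, seeded; both are assumptions).
mydtype = np.dtype([("foo", "i4"), ("bar", "i8")])
N = 1000
rng = np.random.default_rng(0)
foobar = np.zeros(N, dtype=mydtype)
foobar["foo"] = rng.integers(0, 100, N)
foobar["bar"] = rng.integers(0, 10000, N)
b = np.lexsort((foobar["foo"], foobar["bar"]))

# Direct fancy indexing of the record array.
direct = foobar[b]

# The byte-string view must be exactly as wide as one record (12 bytes here).
raw = "|S%d" % mydtype.itemsize
via_view = foobar.view(raw)[b].view(mydtype)

# Compare the raw buffers so every field of every record is covered.
same = direct.tobytes() == via_view.tobytes()
print("identical:", same)
```

The view trick only works because the string dtype has the same item size as the record, so each record is moved as one opaque chunk of bytes.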

performance numpy

Maxim imakaev Dec 26 '14 at 17:07

1 answer

hpaulj · Accepted Answer · 2014-12-28T08:25:46+0000

take, as mentioned in fooobar.com/questions/416105 / ... , is even faster than your double-view approach:

np.take(foobar,b)

For comparison, indexing a single field:

foobar['foo'][b]

take is implemented in compiled C code; see https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/item_selection.c .

My guess is that the difference comes from the more general __getitem__ path and the way it handles the compound dtype.
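A rough way to compare the four variants outside IPython is sketched below. It assumes the standard timeit module in place of the %timeit magic, a seeded generator for the test data, and arbitrary repeat counts; the labels and the candidates dict are made up for this illustration.

```python
import timeit
import numpy as np

# Same setup as in the question, with a seeded generator (an assumption).
mydtype = np.dtype([("foo", "i4"), ("bar", "i8")])
N = 100000
rng = np.random.default_rng(0)
foobar = np.zeros(N, dtype=mydtype)
foobar["foo"] = rng.integers(0, 100, N)
foobar["bar"] = rng.integers(0, 10000, N)
b = np.lexsort((foobar["foo"], foobar["bar"]))

# Hypothetical labels mapped to the operations discussed in this thread.
candidates = {
    "foobar[b]": lambda: foobar[b],
    "view trick": lambda: foobar.view("|S12")[b].view(mydtype),
    "np.take(foobar, b)": lambda: np.take(foobar, b),
    "foobar['foo'][b]": lambda: foobar["foo"][b],
}

timings = {}
for label, fn in candidates.items():
    # Best of 5 repeats of 20 calls each, converted to microseconds per call.
    best = min(timeit.repeat(fn, number=20, repeat=5)) / 20
    timings[label] = best * 1e6
    print(f"{label:22s} {timings[label]:10.1f} us per call")

# np.take must return the same records as plain fancy indexing.
take_matches = np.take(foobar, b).tobytes() == foobar[b].tobytes()
print("take matches foobar[b]:", take_matches)
```

The absolute numbers depend on the machine and the NumPy version, so the ratios may not match the ones quoted in the question.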
