We can use np.bincount, which is quite efficient for such cumulative weighted counting, so here's one approach with it -
counts = np.bincount(i,v)
d[:counts.size] = counts
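
As a quick sanity check, here's a minimal sketch of the snippet above with toy (hypothetical) values for d, i and v, assuming the indices in i are non-negative integers and d is at least i.max()+1 long:

import numpy as np

d = np.zeros(6)                     # output buffer, at least i.max()+1 long
i = np.array([0, 2, 2, 5])          # non-negative integer indices
v = np.array([1.0, 3.0, 4.0, 2.0])  # weights to accumulate per index

counts = np.bincount(i, v)          # weighted sums, length i.max()+1
d[:counts.size] = counts            # write into the (possibly longer) buffer
print(d)                            # [1. 0. 7. 0. 0. 2.]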
Alternatively, to handle the case where d is longer than the counts and may already hold data, use the minlength argument and add into d directly -
d += np.bincount(i,v,minlength=d.size).astype(d.dtype, copy=False)
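
As a minimal sketch (again with hypothetical values), this variant accumulates into a d that already holds data instead of overwriting it:

import numpy as np

d = np.ones(6)                      # pre-existing data that should be kept
i = np.array([0, 2, 2, 5])
v = np.array([1.0, 3.0, 4.0, 2.0])

d += np.bincount(i, v, minlength=d.size).astype(d.dtype, copy=False)
print(d)                            # [2. 1. 8. 1. 1. 3.]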
Runtime test comparing the np.add.at based approach from the other post against the np.bincount based one -
In [61]: def bincount_based(d,i,v):
...: counts = np.bincount(i,v)
...: d[:counts.size] = counts
...:
...: def add_at_based(d,i,v):
...: np.add.at(d, i, v)
...:
In [62]:
...: N = 10000
...: i = np.random.randint(0,1000,(N))
...: v = np.random.randint(0,1000,(N))
...:
...:
...: M = 12000
...: d1 = np.zeros(M)
...: d2 = np.zeros(M)
...:
In [63]: bincount_based(d1,i,v)
...: add_at_based(d2,i,v)
...:
In [64]: np.allclose(d1,d2)
Out[64]: True
In [67]:
...: M = 12000
...: d1 = np.zeros(M)
...: d2 = np.zeros(M)
...:
In [68]: %timeit add_at_based(d2,i,v)
1000 loops, best of 3: 1.83 ms per loop
In [69]: %timeit bincount_based(d1,i,v)
10000 loops, best of 3: 52.7 µs per loop