Is there a way to make a numpy array operation faster?

Suppose I have a matrix:

A = [[2, 1]
     [1, 2]]

And a list of matrices:

B = [[1, 0]   C = [[2, 1],  D = [[0, 0],  E = [[1, 0],
     [1, 0]]       [0, 0]]       [0, 0]]       [0, 0]]

First I want to flatten A: A.flatten() = [2 1 1 2]. Then we obtain the sum of these components multiplied by B, C, D and E respectively. So:

A[0] * B + A[1]*C + A[2]*D + A[3]*E

Now consider a more general case:

A[0] * X_1 + A[1] * X_2 + ... + A[n-1] * X_n

Where X_n can have any dimension. This is the code I came up with for this:

import numpy as np
from functools import reduce
from operator import mul

def product(iterable):
    """Return the product of all items in *iterable*.

    The initializer 1 makes the empty iterable return 1 (the
    multiplicative identity) instead of raising TypeError.
    """
    return reduce(mul, iterable, 1)


def create_table(old_shape, new_shape):
    """Build the list X_1, ..., X_n of random arrays.

    One array of shape *new_shape* is generated for every element of an
    array with shape *old_shape*.
    """
    count = int(np.prod(old_shape))
    return [np.random.rand(*new_shape) for _ in range(count)]


def sum_expansion(arr, lookup, shape):
    """Return A[0]*X_1 + A[1]*X_2 + ... + A[n-1]*X_n.

    *arr* supplies the scalar coefficients (read in flattened order),
    *lookup* supplies the arrays X_k, and *shape* is the shape of the
    accumulated result.
    """
    acc = np.zeros(shape)
    coeffs = arr.ravel()
    for idx in range(coeffs.size):
        acc += coeffs[idx] * lookup[idx]
    return acc

if __name__ == '__main__':
    # Table of X_k arrays for a 2x2 coefficient matrix and 3x3x3 outputs.
    lookup = create_table((2, 2), (3, 3, 3))
    # Generate random 2 x 2 coefficient matrices lazily.
    random_mats = (np.random.rand(2, 2) for _ in range(100000))
    expanded = [sum_expansion(mat, lookup, (3, 3, 3)) for mat in random_mats]
    print(expanded)

It takes about 74 seconds to execute this code on my machine. Is there a way to reduce the time this code takes?

+4
source share
3 answers

We can use np.tensordot after collecting randos into a single array, like so -

np.tensordot(np.array(randos).reshape(-1,4),lookup, axes=((-1),(0)))

Alternatively, reshape the lookup list into one array and let np.tensordot contract both axes at once -

lookup_arr = np.asarray(lookup).reshape(2,2,3,3,3)
out = np.tensordot(randos,lookup_arr,axes=((-2,-1),(0,1)))

Runtime test -

In [69]: randos = [np.random.rand(2, 2) for _ in range(100)]

In [73]: lookup = create_table((2, 2), (3, 3, 3))

In [74]: lookup_arr = np.asarray(lookup).reshape(2,2,3,3,3)

In [75]: out1 = np.tensordot(np.array(randos).reshape(-1,4),lookup, axes=((-1),(0)))
    ...: out2 = np.tensordot(randos,lookup_arr,axes=((-2,-1),(0,1)))
    ...: 

In [76]: np.allclose(out1, out2)
Out[76]: True

In [77]: %timeit np.tensordot(np.array(randos).reshape(-1,4),\
                                      lookup, axes=((-1),(0)))
10000 loops, best of 3: 37 µs per loop

In [78]: %timeit np.tensordot(randos,lookup_arr,axes=((-2,-1),(0,1)))
10000 loops, best of 3: 33.3 µs per loop

In [79]: %timeit np.asarray(lookup).reshape(2,2,3,3,3)
100000 loops, best of 3: 2.18 µs per loop
+1
In [20]: randos = [np.random.rand(2, 2) for _ in range(10)]

In [21]: timeit [sum_expansion(x,lookup,(3,3,3)) for x in randos]                                                       10000 loops, best of 3: 184 µs per loop  

That is about 18 µs per sum_expansion call.

In [22]: timeit create_table((2,2),(3,3,3))                                                                             
100000 loops, best of 3: 14.1 µs per loop      

These timings are already quite small, so most of the cost here is Python and numpy call overhead.


For the 3-D case, using einsum, this becomes:

def ein_expansion(arr, lookup, shape):
    """einsum version of sum_expansion for a 2-D coefficient array.

    *lookup* must be an ndarray whose two leading axes match *arr*;
    *shape* is unused and kept only for signature compatibility.
    """
    return np.einsum('ij,ij...->...', arr, lookup)

In [45]: L = np.array(lookup).reshape(2,2,3,3,3)

In [43]: timeit [ein_expansion(r, L,(3,3,3)) for r in randos]                                                           
10000 loops, best of 3: 58.3 µs per loop  

We can also operate on all the randos at once.

 In [59]: timeit np.einsum('oij,ij...->o...',np.array(randos),L)                                                         
 100000 loops, best of 3: 15.8 µs per loop   

 In [60]: np.einsum('oij,ij...->o...',np.array(randos),L).shape                                                           
 Out[60]: (10, 3, 3, 3)  
+2

Here is another approach:

import numpy as np


def do_sum(x, mat_lst):
    """Sum of x.flatten()[j] * mat_lst[j] for a list of 2-D matrices."""
    coeffs = np.array(x).flatten().reshape(1, -1)
    print('A shape: ', coeffs.shape)
    stacked = np.stack(mat_lst)
    print('B shape: ', stacked.shape)
    # 'j' is summed over; the length-1 'i' axis drops out of the output.
    return np.einsum('ij,jkl->kl', coeffs, stacked)

A = [[1,2],[3,4]]
B = [[[1,1],[1,1]],[[2,2],[2,2]],[[3,3],[3,3]],[[4,4],[4,4]]]

do_sum(A,B)

A shape:  (1, 4)
B shape:  (4, 2, 2)

[[30 30]
 [30 30]]

-

This generalizes to n-d arrays: it works regardless of the dimensions of x and mat_lst.

def do_sum(x, mat_lst):
    """Generalized sum: mat_lst entries may have any dimensionality."""
    coeffs = np.array(x).flatten()
    stacked = np.stack(mat_lst)
    print("A shape: {}\nB shape: {}".format(coeffs.shape, stacked.shape))
    # Sum over the shared leading axis; '...' carries the rest through.
    return np.einsum('i,i...', coeffs, stacked)

A = [[1,2],[3,4]]
B = [np.random.rand(2,2,2) for _ in range(4)]
do_sum(A,B)

(Note: einsum is sensitive to the exact shapes involved — e.g. a (1, 3) row matrix and a (3,) 1-D array are treated differently — which is why the inputs are flattened and stacked first.)

Here a.shape = (n,) and b.shape = (n, ...): einsum multiplies a and b along their shared first axis and sums over it. The remaining axes of b are captured by the ellipsis (...) and carried through to the output unchanged.

The index string passed to einsum captures all this information. On the input side (everything to the left of ->), we mark the indices for each operand (i.e., the input arrays a and b), separated by commas. The indices to be summed over are repeated (i.e. i). On the output side (to the right of ->) we indicate the output indices. Our function does not need an output specification, because in implicit mode einsum outputs all dimensions not included in the summation.

+2
source

Source: https://habr.com/ru/post/1672611/


All Articles