NumPy np.array vs np.matrix (performance)

Often when working with NumPy I find the distinction annoying: when I pull a vector or a row out of a matrix and then operate on it together with np.array objects, problems usually arise.

To reduce headaches, I have sometimes used only np.matrix (converting every np.array to np.matrix) just for simplicity. However, I suspect there are performance implications. Can anyone comment on what they might be, and why?

It seems that if both are just arrays under the hood, accessing an element is only a matter of computing an offset to get the value, so without reading the whole source I'm not sure what the difference is.

More specifically, what are the performance implications of:

    v = np.matrix([1, 2, 3, 4])
    # versus the below
    w = np.array([1, 2, 3, 4])
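To make the "arrays under the hood" point and the row-extraction headache concrete, here is a minimal sketch. It assumes a NumPy version where np.matrix is still available; the variable names are illustrative only and not part of the original question.

```python
import warnings
import numpy as np

# Recent NumPy releases emit a PendingDeprecationWarning for np.matrix;
# silence it so the demo output stays clean.
warnings.simplefilter("ignore", PendingDeprecationWarning)

data = [[1, 2, 3, 4], [5, 6, 7, 8]]
a = np.array(data)
m = np.matrix(data)

# np.matrix is a subclass of np.ndarray, so both share the same storage model.
is_subclass = issubclass(np.matrix, np.ndarray)

row_a = a[0]  # indexing an array drops a dimension: shape (4,)
row_m = m[0]  # indexing a matrix stays 2-D: shape (1, 4)

# For arrays, * is elementwise.
elementwise = row_a * row_a

# For matrices, * is matrix multiplication, so (1, 4) * (1, 4) is not aligned.
try:
    row_m * row_m
    matrix_product_failed = False
except ValueError:
    matrix_product_failed = True

print(is_subclass, row_a.shape, row_m.shape, elementwise, matrix_product_failed)
```

The extra Python-level logic that keeps matrix results 2-D and redefines `*` is one plausible place for per-call overhead to come from, though this sketch does not measure it.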

Thanks.

2 answers

I added some more tests, and it seems that an array is considerably faster than a matrix when the arrays/matrices are small, but the difference gets smaller for larger data structures:

Small:

    In [11]: a = [[1,2,3,4],[5,6,7,8]]
    In [12]: aa = np.array(a)
    In [13]: ma = np.matrix(a)
    In [14]: %timeit aa.sum()
    1000000 loops, best of 3: 1.77 us per loop
    In [15]: %timeit ma.sum()
    100000 loops, best of 3: 15.1 us per loop
    In [16]: %timeit np.dot(aa, aa.T)
    1000000 loops, best of 3: 1.72 us per loop
    In [17]: %timeit ma * ma.T
    100000 loops, best of 3: 7.46 us per loop

Large:

    In [19]: aa = np.arange(10000).reshape(100,100)
    In [20]: ma = np.matrix(aa)
    In [21]: %timeit aa.sum()
    100000 loops, best of 3: 9.18 us per loop
    In [22]: %timeit ma.sum()
    10000 loops, best of 3: 22.9 us per loop
    In [23]: %timeit np.dot(aa, aa.T)
    1000 loops, best of 3: 1.26 ms per loop
    In [24]: %timeit ma * ma.T
    1000 loops, best of 3: 1.24 ms per loop

Note that in the large case the matrix type is actually slightly faster for multiplication (1.24 ms versus 1.26 ms).

I believe that what I get here is consistent with what @Jaime explains in the comment.
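For anyone who wants to repeat the comparison outside IPython, the timings above can be approximated with the standard timeit module. This is a rough sketch under assumptions: the helper names `bench` and `compare` and the loop counts are my own choices, and absolute numbers will differ by machine and NumPy version.

```python
import timeit
import warnings
import numpy as np

# Recent NumPy releases warn about np.matrix; silence that for the benchmark.
warnings.simplefilter("ignore", PendingDeprecationWarning)

def bench(stmt, env, number):
    """Return the best per-call time in microseconds over 3 repeats."""
    best = min(timeit.repeat(stmt, globals=env, repeat=3, number=number))
    return best / number * 1e6

def compare(aa, number):
    """Time sum and product for an array and its matrix counterpart."""
    ma = np.matrix(aa)
    env = {"np": np, "aa": aa, "ma": ma}
    return {
        "array sum": bench("aa.sum()", env, number),
        "matrix sum": bench("ma.sum()", env, number),
        "array dot": bench("np.dot(aa, aa.T)", env, number),
        "matrix mul": bench("ma * ma.T", env, number),
    }

small = compare(np.array([[1, 2, 3, 4], [5, 6, 7, 8]]), number=2000)
large = compare(np.arange(10000).reshape(100, 100), number=50)

for label, result in (("small", small), ("large", large)):
    for name, usec in result.items():
        print(f"{label:5s} {name:10s} {usec:10.2f} us per call")
```

If the pattern above holds, the fixed per-call overhead of matrix should dominate in the small case and be mostly hidden by the actual computation in the large case.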


There is a general discussion on SciPy.org and in this question.

To compare performance, I ran the following in IPython. It turns out that arrays are considerably faster to construct.

    In [1]: import numpy as np
    In [2]: %%timeit
       ...: v = np.matrix([1, 2, 3, 4])
    100000 loops, best of 3: 16.9 us per loop
    In [3]: %%timeit
       ...: w = np.array([1, 2, 3, 4])
    100000 loops, best of 3: 7.54 us per loop

Consequently, at least for constructing small objects, NumPy arrays perform better than NumPy matrices (about 7.5 us versus 16.9 us here).

Versions used:

Numpy: 1.7.1

IPython: 0.13.2

Python: 2.7
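Since the question mentions converting arrays to matrices wholesale, it may be worth noting that the conversion itself need not copy data. The sketch below assumes a newer NumPy than the version listed above (np.shares_memory is not available in 1.7.1), and the variable names are illustrative.

```python
import warnings
import numpy as np

# Recent NumPy releases warn about the matrix class; silence that here.
warnings.simplefilter("ignore", PendingDeprecationWarning)

a = np.arange(6).reshape(2, 3)

m = np.asmatrix(a)    # matrix view of the same buffer, no copy
back = np.asarray(m)  # plain ndarray view again, still no copy

# All three objects refer to the same underlying memory.
shares = np.shares_memory(a, m) and np.shares_memory(m, back)

# A write through the original array is visible through both views.
a[0, 0] = 99

print(type(m).__name__, type(back).__name__, shares, m[0, 0], back[0, 0])
```

So one possible design choice is to keep data as plain arrays and wrap with np.asmatrix only where matrix semantics are really wanted, which avoids paying the matrix overhead on every operation.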


Source: https://habr.com/ru/post/946551/
