Efficient NumPy Least Squares Computation

Question

Efficient NumPy Least Squares Computation

I need to solve a large set of linear systems in the sense of least squares. I am having trouble understanding the difference in computational performance of numpy.linalg.lstsq(a, b) , np.dot(np.linalg.pinv(a), b) and the mathematical implementation.

I use the following matrices:

 h=np.random.random((50000,100)) a=h[:,:-1].copy() b=-h[:,-1].copy()

and algorithm results:

 # mathematical implementation %%timeit np.dot(np.dot(np.linalg.inv(np.dot(aT,a)),aT),b)

10 cycles, best 3: 36.3 ms per cycle

 # numpy.linalg.lstsq implementation %%timeit np.linalg.lstsq(a, b)[0]

10 loops, best 3: 103 ms per cycle

 %%timeit np.dot(np.linalg.pinv(a), b)

1, best 3: 216 ms per cycle

Why is there a difference?

+5

python numpy least-squares

blaz Jan 14 '17 at 8:48

source share

1 answer

SFTP · Accepted Answer · 2017-01-14T18:51:02+0000

The lstsq routine processes any system: overridden, underdetermined, or well defined. Its output is what you get from pinv (a) * b, but it is faster than calculating the pseudo-inverse. That's why:

General recommendations: do not calculate the inverse matrix if you do not need it. Solving a system for a particular right-hand side is faster than inverting its matrix.

However, your approach with the solution ^T a = a ^T b is faster, even if you invert the matrix. What gives? The fact is that inverting a ^T a is valid only when a has the full rank of the column. Thus, you limited the problem to this particular situation and got speed as a compromise for less generality and, as I will show below, for less security.

But matrix inversion is still inefficient. If you know that a has the full rank of a column, the following is faster than any of your three attempts:

 np.linalg.solve(np.dot(aT, a), np.dot(aT, b))

However, lstsq is still preferable above when it comes to weakly conditional matrices. Product formation a ^T basically squares the condition number, so you are more likely to get meaningless results. Here's an example with caution using the SciPy linalg module (which is essentially equivalent to NumPy but has more methods):

 import numpy as np import scipy.linalg as sl a = sl.hilbert(10) # a poorly conditioned square matrix of size 10 b = np.arange(10) # right hand side sol1 = sl.solve(a, b) sol2 = sl.lstsq(a, b)[0] sol3 = sl.solve(np.dot(aT, a), np.dot(aT, b))

Here lstsq gives almost the same result as solve (the only solution to this system). However, sol3 is completely wrong due to numerical problems (which you won't even be warned about).

sol1:

  [ -9.89821788e+02, 9.70047434e+04, -2.30439738e+06, 2.30601241e+07, -1.19805858e+08, 3.55637424e+08, -6.25523002e+08, 6.44058066e+08, -3.58346765e+08, 8.31333426e+07]

sol2:

  [ -9.89864366e+02, 9.70082635e+04, -2.30446978e+06, 2.30607638e+07, -1.19808838e+08, 3.55645452e+08, -6.25535946e+08, 6.44070387e+08, -3.58353147e+08, 8.31347297e+07]

sol3:

  [ 1.06913852e+03, -4.61691763e+04, 4.83968833e+05, -2.08929571e+06, 4.55280530e+06, -5.88433943e+06, 5.92025910e+06, -5.56507455e+06, 3.62262620e+06, -9.94523917e+05]

Efficient NumPy Least Squares Computation

More articles: