I would say that the “right” way is to use the SVD: look at the spectrum of your singular values and decide how many singular values you want to keep, i.e. decide how closely you want A x to approximate b. Something like this:
import numpy as np
import scipy.linalg as la

def svd_solve(a, b):
    U, s, Vt = la.svd(a, full_matrices=False)  # thin SVD, s sorted in descending order
    r = np.sum(s >= 1e-12)                     # effective rank: singular values above the tolerance
    temp = np.dot(U[:, :r].T, b) / s[:r]       # project b onto the kept directions and rescale
    return np.dot(Vt[:r, :].T, temp)           # x = V_r diag(1/s_r) U_r^T b
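If you want to eyeball the spectrum before committing to a cutoff, you can compute the singular values on their own (reusing the np and la imports above; the 1e-12 tolerance is just the example value from svd_solve):

s = la.svdvals(a)          # singular values only, cheaper than the full SVD
print(s[0], s[-1])         # the extremes; their ratio is the condition number
print(np.sum(s >= 1e-12))  # how many values a 1e-12 cutoff would keep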
However, for a matrix of size (100000, 500) this will be far too slow. I would recommend implementing least squares yourself, adding a small regularization term to guard against the singular-matrix case:
def naive_solve(a, b, lamda):
    # Solve the regularized normal equations (A^T A + lamda*I) x = A^T b.
    return la.solve(np.dot(a.T, a) + lamda * np.identity(a.shape[1]),
                    np.dot(a.T, b))

def pos_solve(a, b, lamda):
    # Same system, but tell solve that the matrix is positive definite.
    return la.solve(np.dot(a.T, a) + lamda * np.identity(a.shape[1]),
                    np.dot(a.T, b), assume_a='pos')
Here’s a timing analysis on my workstation*:
>>> %timeit la.lstsq(a, b)
1.84 s ± 39.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
>>> %timeit naive_solve(a, b, 1e-25)
140 ms ± 4.15 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
>>> %timeit pos_solve(a, b, 1e-25)
135 ms ± 768 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
* Somehow I didn’t have scipy.sparse.linalg.lsmr on my machine, so I couldn’t compare against it.
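For completeness, the call would look something like the sketch below; lsmr minimizes ||Ax - b||^2 + damp^2 ||x||^2, so its damp parameter plays the role of sqrt(lamda) (the value here is just a placeholder):

from scipy.sparse.linalg import lsmr

# Works directly on the tall matrix, never forming A^T A explicitly.
xhat_lsmr = lsmr(a, b, damp=1e-6)[0]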
A note on the two solvers: assume_a='pos' tells solve that the system matrix is symmetric positive definite, so it can use a Cholesky factorization rather than a general LU decomposition. The trade-off is accuracy: forming A^T A squares the condition number, and lamda biases the solution towards zero. But if you pick lamda sensibly, neither effect should matter in practice.
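To make the Cholesky route explicit, here is roughly what assume_a='pos' buys you (chol_solve is my name for this sketch, not part of the timings above):

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def chol_solve(a, b, lamda):
    # Factor the regularized Gram matrix once, then back-substitute.
    gram = np.dot(a.T, a) + lamda * np.identity(a.shape[1])
    c, low = cho_factor(gram)
    return cho_solve((c, low), np.dot(a.T, b))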
As for accuracy:
>>> xhat_lstsq = la.lstsq(a, b)[0]
>>> la.norm(np.dot(a, xhat_lstsq) - b)
1.4628232073579952e-13
>>> xhat_naive = naive_solve(a, b, 1e-25)
>>> la.norm(np.dot(a, xhat_naive) - b)
7.474566255470176e-13
>>> xhat_pos = pos_solve(a, b, 1e-25)
>>> la.norm(np.dot(a, xhat_pos) - b)
7.476075564322223e-13
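The residuals only tell half the story with a matrix this ill-conditioned: solutions with near-identical residuals can still differ as vectors. A quick check, reusing the names above:

print(la.norm(xhat_naive - xhat_lstsq))  # distance between the two estimates of x
print(la.norm(xhat_pos - xhat_naive))    # the two regularized solvers should agree closely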
PS: Here is how I generated a and b:
s = np.logspace(1, -20, 500)            # prescribed singular values, spanning 21 orders of magnitude
u = np.random.randn(100000, 500)
u /= la.norm(u, axis=0)[np.newaxis, :]  # normalize the columns
a = np.dot(u, np.diag(s))               # tall matrix with (approximately) the prescribed spectrum
x = np.random.randn(500)
b = np.dot(a, x)                        # right-hand side consistent with a known x
Note that my a is deliberately ill-conditioned: its singular values span 21 orders of magnitude.
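You can confirm this from the construction (s is the logspace vector above; the 1e21 figure follows directly from its endpoints):

print(s[0] / s[-1])       # 10**1 / 10**-20 = 1e21 by construction
print(np.linalg.cond(a))  # the computed value saturates near 1/eps ~ 1e16, since singular
                          # values below eps * s[0] are lost to rounding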
To expand on the reasoning: what makes this fast or slow is the shape of the problem. Your A has 100,000 rows but only 500 columns, so the system is heavily over-determined (tall and skinny) and all the real work happens in a 500-dimensional space; it is the full SVD of the big matrix that makes svd_solve slow. The key idea is that you can truncate A to rank r, where r is the number of singular values above your tolerance, and typically r << n.
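If you can estimate r up front, you can skip the full decomposition; a sketch using scipy.sparse.linalg.svds, which accepts dense arrays too (truncated_svd_solve is my name for it):

import numpy as np
from scipy.sparse.linalg import svds

def truncated_svd_solve(a, b, r):
    # Compute only the r largest singular triplets (svds requires r < min(a.shape)).
    u, s, vt = svds(a, k=r)
    # Rank-r pseudoinverse applied to b: x = V_r diag(1/s_r) U_r^T b.
    return np.dot(vt.T, np.dot(u.T, b) / s)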
"" "" . , , . - . "", x - - , , x , .. , x , , , , - b. , .