Regularized linear regression with numpy

I can't figure out what is wrong with my code for regularized linear regression. The only version I'm fairly sure is correct is the unregularized one:

    import numpy as np

    def get_model(features, labels):
        # Minimum-norm least-squares fit via the Moore-Penrose pseudo-inverse
        return np.linalg.pinv(features).dot(labels)

Here is my code for the regularized solution, where I can't see the problem:

    def get_model(features, labels, lamb=0.0):
        n_cols = features.shape[1]
        # Closed-form ridge solution: (X^T X + lamb * I)^-1 X^T y
        return np.linalg.inv(
            features.transpose().dot(features) + lamb * np.identity(n_cols)
        ).dot(features.transpose()).dot(labels)

With the default value of 0.0 for lamb, I would expect it to give the same result as the (correct) unregularized version, but the difference is actually quite large.
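For example (a minimal repro with made-up data; the near-duplicate third column is only there to trigger the issue), the two solutions diverge badly:

    import numpy as np

    # Made-up data: the third column is almost a copy of the second,
    # so features.T.dot(features) is nearly singular.
    features = np.array([[1.0, 2.0, 2.0],
                         [3.0, 4.0, 4.0],
                         [5.0, 6.0, 6.0],
                         [7.0, 8.0, 8.0 + 1e-9]])
    labels = np.array([1.0, 2.0, 3.0, 4.0])

    w_pinv = np.linalg.pinv(features).dot(labels)
    w_inv = np.linalg.inv(features.T.dot(features)).dot(features.T).dot(labels)
    print(w_pinv)  # stable: pinv discards the tiny singular value
    print(w_inv)   # unstable: inv amplifies rounding error in the near-singular Gram matrix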

Does anyone see what the problem is?

1 answer

The problem is this:

features.transpose().dot(features) may not be invertible, and numpy.linalg.inv only works for full-rank matrices, as the docs note. The (non-zero) regularization term, on the other hand, always makes the matrix features.T.dot(features) + lamb * np.identity(n_cols) non-singular, so the regularized path is fine; it is the lamb = 0.0 case that hits the singular matrix.
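You can check this directly on a rank-deficient toy matrix (illustrative sketch):

    import numpy as np

    # Duplicate column -> the Gram matrix is rank 2, not 3.
    features = np.array([[1.0, 2.0, 2.0],
                         [3.0, 4.0, 4.0],
                         [5.0, 6.0, 6.0]])
    gram = features.T.dot(features)

    print(np.linalg.matrix_rank(gram))                         # 2: inv() would fail here
    print(np.linalg.matrix_rank(gram + 0.1 * np.identity(3)))  # 3: the ridge term restores full rank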

By the way, your implementation is mathematically correct, but it is not efficient. An efficient way to solve this equation is the least-squares solver.

np.linalg.lstsq(features, labels) can do the job of np.linalg.pinv(features).dot(labels).
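One detail to keep in mind (sketch with toy data; rcond=None assumes a reasonably recent NumPy): np.linalg.lstsq returns a 4-tuple (solution, residuals, rank, singular values), so the solution itself is its first element:

    import numpy as np

    features = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    labels = np.array([1.0, 2.0, 3.0])

    # lstsq returns (solution, residuals, rank, singular_values)
    w, residuals, rank, sv = np.linalg.lstsq(features, labels, rcond=None)
    print(np.allclose(w, np.linalg.pinv(features).dot(labels)))  # True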

In general, you can do this:

    def get_model(A, y, lamb=0):
        n_col = A.shape[1]
        # Solve (A^T A + lamb * I) w = A^T y.  lstsq returns a tuple
        # (solution, residuals, rank, singular_values), so take element 0.
        return np.linalg.lstsq(A.T.dot(A) + lamb * np.identity(n_col),
                               A.T.dot(y), rcond=None)[0]
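Used with the definition above (toy data for illustration), lamb=0 reproduces the pseudo-inverse solution and a positive lamb shrinks the coefficients:

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0],
                  [7.0, 8.0]])
    y = np.array([1.0, 2.0, 3.0, 4.0])

    print(get_model(A, y))            # matches np.linalg.pinv(A).dot(y)
    print(get_model(A, y, lamb=1.0))  # ridge solution, smaller norm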
