Python: Sklearn.linear_model.LinearRegression works weird

I am trying to fit a linear regression with several variables, but I think sklearn.linear_model behaves strangely. Here is my code:

import numpy as np
from sklearn import linear_model

b = np.array([3,5,7]).transpose() ## the right answer I am expecting
x = np.array([[1,6,9],   ## 1*3 + 6*5 + 7*9 = 96
              [2,7,7],   ## 2*3 + 7*5 + 7*7 = 90
              [3,4,5]])  ## 3*3 + 4*5 + 5*7 = 64
y = np.array([96,90,64]).transpose()

clf = linear_model.LinearRegression()
clf.fit(x, y)
print(clf.coef_)            ## <== gives [-2.2  5.   4.4], NOT [3, 5, 7]
print(np.dot(x, clf.coef_)) ## <== gives [67.4  61.4  35.4], NOT [96, 90, 64]
1 answer

To recover the original coefficients, you need to pass the keyword argument fit_intercept=False when constructing the linear regression:

import numpy as np
from sklearn import linear_model

b = np.array([3,5,7])
x = np.array([[1,6,9],  
              [2,7,7],   
              [3,4,5]])  
y = np.array([96,90,64])

clf = linear_model.LinearRegression(fit_intercept=False)
clf.fit(x, y)
print(clf.coef_)            ## [3. 5. 7.]
print(np.dot(x, clf.coef_)) ## [96. 90. 64.]

Using fit_intercept=False prevents the LinearRegression object from centering the data, i.e. from working with x - x.mean(axis=0), which it would otherwise do (fitting the mean with a constant offset, y = xb + c). This is equivalent to appending a column of ones to x.
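To see where the "missing" signal goes with the default settings, here is a small sketch (using the question's data) showing that with fit_intercept=True the model still reproduces y exactly; the constant part just lives in intercept_ rather than in coef_:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[1, 6, 9],
              [2, 7, 7],
              [3, 4, 5]])
y = np.array([96, 90, 64])

# Default model: fit_intercept=True, so the data is centered first
# and coef_ alone will not equal [3, 5, 7].
clf = LinearRegression()
clf.fit(x, y)

# The fitted model still reproduces y exactly; the constant offset
# is stored separately in intercept_.
print(clf.coef_)                       # not [3, 5, 7]
print(clf.intercept_)                  # nonzero constant offset
print(x @ clf.coef_ + clf.intercept_)  # matches y
```

So the default model is not "wrong": it simply splits the fit into a centered coefficient vector plus an intercept, which is what fit_intercept=False turns off.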

Also note that transpose does nothing to a 1-D array (it returns the same array), so the .transpose() calls in the question have no effect.


Source: https://habr.com/ru/post/1545886/
