Simple one-vector input arrays are considered incompatible by scikit

I have two variables that originally come from the same pandas DataFrame. I extract one as TT and the other as t. I use TT to predict t, which is binary. I cannot work out why scikit-learn treats the two variables as having incompatible shapes. I tried applying a fix to TT, but that didn't work.

>>> TT=adf.x1.values
>>> t=adf.y.values
>>> TT.shape
(2856L,)
>>> t.shape
(2856L,)
>>> TT
array([ 4.43081665,  5.99146461,  4.86753464, ...,  4.58496761,
        8.4553175 ,  7.37775898], dtype=float32)
>>> t
array([ 0.,  0.,  0., ...,  0.,  0.,  0.], dtype=float32)
>>> clf=LogisticRegression(C=1)   
>>> clf.fit(TT,t)
Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:...\sklearn\svm\base.py", line 686, in fit
        (X.shape[0], y.shape[0]))
ValueError: X and y have incompatible shapes.
X has 1 samples, but y has 2856.)
1 answer

If you look at the documentation for sklearn.linear_model.LogisticRegression.fit, you will see that

  • TT must have shape (n_samples, n_features), and
  • t must have shape (n_samples,).

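So TT must be a 2D array. As the error message shows, the 1D TT was interpreted as a single sample rather than as 2856 samples with one feature each. You can give TT the shape (2856L, 1) with TT.reshape(-1, 1).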
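A minimal sketch of the reshape, assuming NumPy and scikit-learn are installed. The data here is synthetic and only stands in for adf.x1.values and adf.y.values from the question; the sample count of 200 and the threshold used to build t are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for adf.x1.values and adf.y.values (assumed data,
# not the asker's actual DataFrame).
rng = np.random.default_rng(0)
TT = rng.normal(loc=5.0, scale=1.5, size=200).astype(np.float32)
t = (TT > 5.0).astype(np.float32)  # binary target, same dtype as in the question

print(TT.shape)  # (200,)  one-dimensional, rejected or misread by fit()

# Turn the vector into a single-column matrix; -1 lets NumPy infer the row count.
X = TT.reshape(-1, 1)
print(X.shape)   # (200, 1)  n_samples rows, n_features = 1 column

clf = LogisticRegression(C=1)
clf.fit(X, t)              # X is 2D, t stays 1D
pred = clf.predict(X[:5])  # predict() expects the same 2D layout
```

With a pandas DataFrame, selecting the column with a list, as in adf[['x1']].values, should also yield a 2D array of shape (n_samples, 1), so no reshape is needed in that case.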


Source: https://habr.com/ru/post/1523625/

