I am trying to combine the results of the predict method with the source data in a pandas.DataFrame object.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import pandas as pd
import numpy as np
data = load_iris()
df = pd.DataFrame(data = data.data)
df['class'] = data.target
X = df.loc[:, [0, 1, 2, 3]].to_numpy()
y = np.array(df['class'])
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size = 0.8)
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
y_hats = model.predict(X_test)
To combine these predictions with the original df, I try this:
df['y_hats'] = y_hats
But it raises:
ValueError: Length of values does not match length of index
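I know I could split df into train_df and test_df first and the problem would go away, but in practice I need to build the matrices X and y before splitting (a constraint of my real problem). How can I align the values in y_hats with the right rows of df, given that the information about which rows ended up in X_test and y_test appears to be lost? Or do I have to split the DataFrame first and build the feature matrices afterwards? For the rows that were used in train, I would just like to fill in np.nan.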
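One possible way to keep track of which rows went into the test set is to split the row labels together with X and y, since train_test_split accepts any number of equal-length arrays and splits them consistently. The sketch below assumes that approach; the names idx_train and idx_test and the random_state=0 setting are additions for illustration and reproducibility, not part of the code above.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
df = pd.DataFrame(data=data.data)
df['class'] = data.target

X = df.loc[:, [0, 1, 2, 3]].to_numpy()
y = df['class'].to_numpy()

# Pass the row labels as a third array so they are split the same way as X and y.
# idx_train / idx_test are illustrative names; random_state is fixed for reproducibility.
X_train, X_test, y_train, y_test, idx_train, idx_test = train_test_split(
    X, y, df.index.to_numpy(), train_size=0.8, random_state=0
)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)
y_hats = model.predict(X_test)

# Start with NaN everywhere, then fill in predictions only for the test rows.
df['y_hats'] = np.nan
df.loc[idx_test, 'y_hats'] = y_hats
```

With this, the rows used for training keep np.nan in the y_hats column, and each test row carries its own prediction. Because the labels in idx_test come from df.index, the same pattern should also work when the index is not a simple 0..n-1 range.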