What am I doing wrong here? I have a large dataset that I want to train on incrementally, using partial_fit with scikit-learn's SGDClassifier.
I do the following:
    from sklearn.linear_model import SGDClassifier
    import pandas as pd

    chunksize = 5
    clf2 = SGDClassifier(loss='log', penalty="l2")

    for train_df in pd.read_csv("train.csv", chunksize=chunksize, iterator=True):
        X = train_df[features_columns]
        Y = train_df["clicked"]
        clf2.partial_fit(X, Y)
I get this error:
    Traceback (most recent call last):
      File "/predict.py", line 48, in <module>
        sys.exit(0 if main() else 1)
      File "/predict.py", line 44, in main
        predict()
      File "/predict.py", line 38, in predict
        clf2.partial_fit(X, Y)
      File "/Users/anaconda/lib/python3.5/site-packages/sklearn/linear_model/stochastic_gradient.py", line 512, in partial_fit
        coef_init=None, intercept_init=None)
      File "/Users/anaconda/lib/python3.5/site-packages/sklearn/linear_model/stochastic_gradient.py", line 349, in _partial_fit
        _check_partial_fit_first_call(self, classes)
      File "/Users/anaconda/lib/python3.5/site-packages/sklearn/utils/multiclass.py", line 297, in _check_partial_fit_first_call
        raise ValueError("classes must be passed on the first call "
    ValueError: classes must be passed on the first call to partial_fit.
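Note that the classifier cannot know how many classes there are on the first call, so you have to pass them yourself, for example np.unique(target), where target is the class column. Because the data is read in chunks, the labels you pass must cover every class that can appear, not only those in the current chunk. With that in place, the following works: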
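    import numpy as np

    # Assumption: "clicked" is a binary 0/1 label. The same full label set must be
    # passed on every call; np.unique(Y) on a 5-row chunk can miss a class.
    classes = np.array([0, 1])

    for train_df in pd.read_csv("train.csv", chunksize=chunksize, iterator=True):
        X = train_df[features_columns]
        Y = train_df["clicked"]
        clf2.partial_fit(X, Y, classes=classes)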
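As a self-contained illustration of the chunked loop, here is a minimal sketch that can be run without train.csv. The in-memory CSV, the column names f1 and f2, and the binary label set [0, 1] are assumptions made for the example and do not come from the question. The loss is left at its default because the name of the logistic loss differs between scikit-learn releases ('log' in older ones, 'log_loss' in newer ones).

```python
import io

import numpy as np
import pandas as pd
from sklearn.linear_model import SGDClassifier

# Hypothetical stand-in for train.csv: two features and a binary "clicked" label.
rng = np.random.default_rng(0)
n_rows = 200
demo = pd.DataFrame({
    "f1": rng.normal(size=n_rows),
    "f2": rng.normal(size=n_rows),
})
demo["clicked"] = (demo["f1"] + demo["f2"] > 0).astype(int)
csv_source = io.StringIO(demo.to_csv(index=False))

features_columns = ["f1", "f2"]

# Assumption: the complete label set is known before training starts.
all_classes = np.array([0, 1])

clf = SGDClassifier(penalty="l2", random_state=0)

# Read the data five rows at a time and update the model on each chunk.
for chunk in pd.read_csv(csv_source, chunksize=5):
    X = chunk[features_columns]
    y = chunk["clicked"]
    # Passing the same full label set every time is safe even when a
    # small chunk happens to contain only one of the classes.
    clf.partial_fit(X, y, classes=all_classes)

predictions = clf.predict(demo[features_columns])
print(clf.classes_)
```

If the labels are not known in advance, one option is a cheap first pass over only the label column (for example with the usecols argument of pd.read_csv) to collect the unique values before training.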
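See the documentation for partial_fit: http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html#sklearn.linear_model.SGDClassifier.partial_fit

The classes argument is required on the first call and can be omitted on later calls:

    clf2.partial_fit(X, Y, classes=np.unique(Y))

This works only when Y contains every class label; otherwise pass the full label set explicitly.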
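To make the first-call rule concrete, the sketch below reproduces the error from the question on a tiny array and then shows the two calls that succeed. The arrays and the label set [0, 1] are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Hypothetical mini-batches; the first one contains only class 0.
X_first = np.array([[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])
y_first = np.array([0, 0, 0])
X_second = np.array([[2.0, 2.0], [3.0, 1.0]])
y_second = np.array([1, 1])

# Without classes, the very first partial_fit call raises ValueError.
broken = SGDClassifier(random_state=0)
try:
    broken.partial_fit(X_first, y_first)
    first_call_error = None
except ValueError as exc:
    first_call_error = str(exc)

# With the full label set, the first call succeeds even though
# y_first does not contain class 1.
clf = SGDClassifier(random_state=0)
clf.partial_fit(X_first, y_first, classes=np.array([0, 1]))

# On later calls the classes argument may be left out.
clf.partial_fit(X_second, y_second)

print(first_call_error)
print(clf.classes_)
```

Passing a different label set on a later call than on the first one is rejected by scikit-learn, which is why computing np.unique on each small chunk is fragile.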
Source: https://habr.com/ru/post/1669313/