How to optimize for a different `eval_metric` with XGBoost in an sklearn pipeline?

I am trying to use XGBoost and optimize the `eval_metric` for `auc` (as described here).

This works fine when using the classifier directly, but fails when I try to use it inside a pipeline.

What is the correct way to pass an argument to `.fit` through the sklearn pipeline?

Example:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris
from xgboost import XGBClassifier
import xgboost
import sklearn

print('sklearn version: %s' % sklearn.__version__)
print('xgboost version: %s' % xgboost.__version__)

X, y = load_iris(return_X_y=True)

# Without using the pipeline: 
xgb = XGBClassifier()
xgb.fit(X, y, eval_metric='auc')  # works fine

# Making a pipeline with this classifier and a scaler:
pipe = Pipeline([('scaler', StandardScaler()), ('classifier', XGBClassifier())])

# using the pipeline, but not optimizing for 'auc': 
pipe.fit(X, y)  # works fine

# however, this does not work (even with the double-underscore prefix):
pipe.fit(X, y, classifier__eval_metric='auc')  # fails

Error:
TypeError: before_fit() got an unexpected keyword argument 'classifier__eval_metric'

As for the xgboost version:
`xgboost.__version__` shows 0.6
`pip3 freeze | grep xgboost` shows `xgboost==0.6a2`.

1 answer

You can pass a fit parameter to an individual step of the pipeline. The `Pipeline.fit()` documentation describes its `fit_params` argument like this:

Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.

So in your case:

pipe.fit(X_train, y_train, classifier__eval_metric='auc')
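The `s__p` routing is a generic `Pipeline` feature, not anything XGBoost-specific. A minimal runnable sketch, using sklearn's own `GradientBoostingClassifier` in place of `XGBClassifier` so it runs without xgboost installed (the prefix routing is identical for either estimator), and `sample_weight` as the routed fit parameter:

```python
# Sketch of Pipeline fit-parameter routing: a fit kwarg named
# '<step>__<param>' is forwarded to that step's fit() method.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', GradientBoostingClassifier(n_estimators=20)),
])

# 'sample_weight' for the step named 'classifier' becomes
# 'classifier__sample_weight' at the Pipeline level.
pipe.fit(X, y, classifier__sample_weight=np.ones(len(y)))
print(pipe.score(X, y))
```

Note that in recent XGBoost releases (1.6 and later) `eval_metric` is set in the `XGBClassifier(...)` constructor rather than passed to `fit`, so on a modern stack you would write `XGBClassifier(eval_metric='auc')` and need no fit parameter at all.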



Source: https://habr.com/ru/post/1672291/
