Sklearn: Evaluate the performance of each classifier of a OneVsRestClassifier inside GridSearchCV

I am dealing with multi-label classification with OneVsRestClassifier and SVC:

    from sklearn.datasets import make_multilabel_classification
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.svm import SVC
    from sklearn.grid_search import GridSearchCV

    L = 3
    X, y = make_multilabel_classification(n_classes=L, n_labels=2,
                                          allow_unlabeled=True,
                                          random_state=1,
                                          return_indicator=True)

    model_to_set = OneVsRestClassifier(SVC())

    parameters = {
        "estimator__C": [1, 2, 4, 8],
        "estimator__kernel": ["poly", "rbf"],
        "estimator__degree": [1, 2, 3, 4],
    }

    model_tunning = GridSearchCV(model_to_set, param_grid=parameters, scoring='f1')

    model_tunning.fit(X, y)

    print model_tunning.best_score_
    print model_tunning.best_params_
    #0.855175822314
    #{'estimator__kernel': 'poly', 'estimator__C': 1, 'estimator__degree': 3}

First question

What does the value of 0.85 represent? Is it the best score among the L classifiers, or an average of them? Likewise, does the parameter set correspond to the top scorer among the L classifiers?

Second question

If I'm right, OneVsRestClassifier literally builds L classifiers, one per label, so one would expect to be able to access or monitor the performance of EACH LABEL. But how, in the above example, do I get L scores from the GridSearchCV object?

EDIT

To make the task simpler and to learn more about OneVsRestClassifier, I explored the model itself before the tuning:

    model_to_set.fit(X, y)
    gp = model_to_set.predict(X)                  # the "global" prediction
    fp = model_to_set.estimators_[0].predict(X)   # the first-class prediction
    sp = model_to_set.estimators_[1].predict(X)   # the second-class prediction
    tp = model_to_set.estimators_[2].predict(X)   # the third-class prediction

It can be shown that gp.T[0]==fp, gp.T[1]==sp and gp.T[2]==tp. Thus, the "global" prediction is simply the L individual predictions stacked column by column, and the second question is solved.
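For example, a quick check of that claim (a minimal sketch reusing gp, fp, sp and tp from the snippet above):

    import numpy as np

    # Each column of the "global" prediction equals the corresponding
    # per-class estimator's binary prediction
    print(np.array_equal(gp.T[0], fp))   # True
    print(np.array_equal(gp.T[1], sp))   # True
    print(np.array_equal(gp.T[2], tp))   # True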

But it still confuses me: if one OneVsRestClassifier metaclassifier contains L classifiers, how can GridSearchCV return only one best result, corresponding to one of the 4 * 2 * 4 parameter sets, for a metaclassifier that has L classifiers inside?

Any comments would be appreciated.

+5
3 answers

GridSearchCV creates a grid from your parameter values and evaluates your OneVsRestClassifier as an atomic classifier (i.e. GridSearchCV does not know what is inside this metaclassifier).

First: 0.85 is the best OneVsRestClassifier score among all possible combinations (32 combinations in your case, 4 * 2 * 4) of the parameters ("estimator__C", "estimator__kernel", "estimator__degree"). This means that GridSearchCV evaluates 32 (again, only in this particular case) possible OneVsRestClassifier instances, each of which contains L SVCs. All L classifiers inside the same OneVsRestClassifier share the same parameter values (but each of them learns to separate its own class from the rest).

i.e. from the set

    {OneVsRestClassifier(SVC(C=1, kernel="poly", degree=1)),
     OneVsRestClassifier(SVC(C=1, kernel="poly", degree=2)),
     ...,
     OneVsRestClassifier(SVC(C=8, kernel="rbf", degree=3)),
     OneVsRestClassifier(SVC(C=8, kernel="rbf", degree=4))}

it chooses the one with the best score.

model_tunning.best_params_ holds the parameters for OneVsRestClassifier(SVC()) with which it reaches model_tunning.best_score_. You can get the best OneVsRestClassifier from the model_tunning.best_estimator_ attribute.
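For illustration, a minimal sketch reusing the parameters dict and model_tunning from the question:

    from sklearn.grid_search import ParameterGrid   # sklearn.model_selection in newer versions

    # 4 * 2 * 4 = 32 candidate parameter sets, i.e. 32 candidate OneVsRestClassifiers
    print(len(ParameterGrid(parameters)))             # 32

    best_ovr = model_tunning.best_estimator_          # the winning OneVsRestClassifier
    print(len(best_ovr.estimators_))                  # L = 3 inner SVCs
    print(best_ovr.estimators_[0].get_params()["C"])  # same C (and kernel, degree) in all L SVCs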

Second: There is no ready-made code for obtaining separate scores for the L classifiers from OneVsRestClassifier, but you can look at the implementation of the OneVsRestClassifier.fit method, or try this (it should work :)):

    # Here X, y - your dataset
    one_vs_rest = model_tunning.best_estimator_
    yT = one_vs_rest.label_binarizer_.transform(y).toarray().T

    # Iterate through all L classifiers
    for classifier, is_ith_class in zip(one_vs_rest.estimators_, yT):
        print(classifier.score(X, is_ith_class))
+4

Inspired by @Olologin's answer, I realized that 0.85 is the best weighted average of the f1 scores (in this example) obtained over the L predictions. In the following code, I evaluate the model by an inner test (on the training data itself), using the macro average of the f1 score:

    from numpy import mean
    from sklearn.metrics import f1_score

    # yT and model_tunning come from the snippets above

    # Case A, inspect the F1 score using the meta-classifier
    F_A = f1_score(y, model_tunning.best_estimator_.predict(X), average='macro')

    # Case B, inspect the F1 scores of each label (binary task) and
    # collect them by macro averaging
    F_B = []
    for label, clf in zip(yT, model_tunning.best_estimator_.estimators_):
        F_B.append(f1_score(label, clf.predict(X)))
    F_B = mean(F_B)

    F_A == F_B  # True

Thus, this means that GridSearchCV uses one of the 4 * 2 * 4 parameter sets to construct a metaclassifier, which in turn makes a prediction on each label with one of its L classifiers. The result is L f1 scores for the L labels, each measuring the performance of a binary task. Finally, a single score is obtained by averaging the L f1 scores (macro or weighted average, as specified by the average parameter of f1_score).

GridSearchCV then selects the best averaged f1 score among the 4 * 2 * 4 parameter sets, which in this example is 0.85.

Although the wrapper is convenient for a multi-label problem, it can only maximize the averaged f1 score with a single set of parameters shared by the L classifiers. If you want to optimize the performance of each label separately, it seems you have to build the L classifiers yourself, without the wrapper.
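For example, a minimal sketch of that idea (assuming the X and y from the question; each label column gets its own independent grid search, so each label may end up with different SVC parameters):

    from sklearn.svm import SVC
    from sklearn.grid_search import GridSearchCV   # sklearn.model_selection in newer versions

    per_label_models = []
    for i in range(y.shape[1]):                    # one independent search per label
        search = GridSearchCV(SVC(),
                              param_grid={"C": [1, 2, 4, 8],
                                          "kernel": ["poly", "rbf"],
                                          "degree": [1, 2, 3, 4]},
                              scoring="f1")
        search.fit(X, y[:, i])                     # binary target for label i
        per_label_models.append(search.best_estimator_)
        print(search.best_params_)                 # may differ from label to label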

+3

As for the second question, you can use GridSearchCV with scikit-multilearn's BinaryRelevance classifier. Like OneVsRestClassifier, Binary Relevance creates L single-label classifiers, one per label. For each label, the training data is 1 if the label is present and 0 if it is not. The best selected set of classifiers is a BinaryRelevance instance stored in the best_estimator_ property of GridSearchCV. To predict probability floats, use the predict_proba method of that object. An example can be found in the scikit-multilearn docs on model selection.

In your case, I would run the following code:

    from skmultilearn.problem_transform import BinaryRelevance
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC
    from sklearn import metrics

    model_to_set = BinaryRelevance(SVC())

    parameters = {
        "classifier__C": [1, 2, 4, 8],
        "classifier__kernel": ["poly", "rbf"],
        "classifier__degree": [1, 2, 3, 4],
    }

    model_tunning = GridSearchCV(model_to_set, param_grid=parameters, scoring='f1')

    model_tunning.fit(X, y)

    # for some X_test testing set
    predictions = model_tunning.best_estimator_.predict(X_test)

    # average=None gives per-label scores
    metrics.f1_score(y_test, predictions, average=None)
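For the per-label probability floats mentioned above, a minimal sketch (assuming the base classifier supports predict_proba, e.g. BinaryRelevance(SVC(probability=True)); scikit-multilearn returns scipy sparse matrices, so the result is densified here):

    # requires the base estimator to expose predict_proba,
    # e.g. model_to_set = BinaryRelevance(SVC(probability=True))
    probabilities = model_tunning.best_estimator_.predict_proba(X_test)
    print(probabilities.toarray())   # (n_samples, n_labels) array of floats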

Note that there are much better multi-label classification methods than Binary Relevance. You can find them in Madjarov's comparison or in my recent article.

+1
