sklearn's metrics.log_loss is positive, while the scoring "neg_log_loss" is negative

Just making sure I have this right:

If we use sklearn.metrics.log_loss standalone, i.e. log_loss(y_true, y_pred), it produces a positive number, and the lower the score, the better the performance.

However, if we use 'neg_log_loss' as the scoring scheme, e.g. in cross_val_score, the score is negative, and the higher the score, the better the performance.

This is because the scoring scheme is designed to be consistent with the other scoring schemes: since in general higher is better, we negate the usual log_loss so that it follows the same convention, and this is done solely for that purpose. Is this understanding correct?

[Background: I got positive values for metrics.log_loss and negative values for "neg_log_loss", and both point to the same documentation page.]
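
To verify the claim end to end, here is a minimal sketch on toy data (the dataset and model are my own assumptions, not from the question); the standalone metric comes out positive and the 'neg_log_loss' scores come out negative:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import log_loss
    from sklearn.model_selection import cross_val_score

    # Toy binary classification problem and a simple classifier.
    X, y = make_classification(n_samples=200, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    # Standalone metric: a positive number, lower is better.
    print(log_loss(y, clf.predict_proba(X)))

    # Scoring scheme: the same quantity negated, higher is better.
    print(cross_val_score(clf, X, y, scoring='neg_log_loss', cv=5))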

1 answer

sklearn.metrics.log_loss is an implementation of the error metric as it is usually defined, and like most error metrics it is a positive number. It is a metric that is typically minimized (like, say, mean squared error in regression), in contrast to metrics such as accuracy, which are maximized.
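
To see that the two really are the same quantity up to sign, you can compare the raw metric with the scorer object behind 'neg_log_loss' (a hedged sketch on the same kind of toy setup as above):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import get_scorer, log_loss

    X, y = make_classification(n_samples=200, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    metric = log_loss(y, clf.predict_proba(X))      # positive error value
    score = get_scorer('neg_log_loss')(clf, X, y)   # negative utility value
    print(np.isclose(score, -metric))               # True: score == -metric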

neg_log_loss is therefore a technicality that turns the error into a utility value, which lets sklearn's functions and classes optimize by maximizing this utility without having to change their behavior for each metric (this applies, among others, to cross_val_score, GridSearchCV, and RandomizedSearchCV).
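
For instance, GridSearchCV always selects the candidate with the highest score, so passing scoring='neg_log_loss' makes it minimize log loss without any metric-specific handling (a sketch; the data and parameter grid are illustrative):

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(n_samples=200, random_state=0)
    search = GridSearchCV(
        LogisticRegression(max_iter=1000),
        param_grid={'C': [0.01, 0.1, 1, 10]},
        scoring='neg_log_loss',
        cv=5,
    )
    search.fit(X, y)
    print(search.best_params_)  # the C with the highest neg_log_loss...
    print(search.best_score_)   # ...i.e. the lowest log loss; the value is negative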


Source: https://habr.com/ru/post/1266047/

