GBM classification with caret

When using caret's train function to fit GBM classification models, the model's prediction function converts the probability predictions into factors based on a probability threshold of 0.5.

      out <- ifelse(gbmProb >= .5, modelFit$obsLevels[1], modelFit$obsLevels[2])
      ## to correspond to gbmClasses definition above

This conversion seems premature if the user is trying to maximize the area under the ROC curve (AUROC). While sensitivity and specificity correspond to a single probability threshold (and therefore require factor predictions), I would prefer that AUROC be calculated from the raw probability output of gbmPredict. In my experience, I have rarely cared about the calibration of a classification model; I want the most informative model possible, regardless of the probability threshold at which the model predicts "1" versus "0". Is it possible to force raw probabilities into the AUROC calculation? This seems tricky, since whatever summary function is used gets passed predictions that are already binary.
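To make the concern concrete, here is a minimal base-R sketch, not taken from caret, that compares the AUROC computed from probabilities with the AUROC computed after thresholding at 0.5. The helper `auc_rank` is a hypothetical name, and the probabilities and labels are made up for illustration.

```r
# Hypothetical helper (not part of caret): AUC via the Mann-Whitney rank statistic.
auc_rank <- function(score, is_pos) {
  r <- rank(score)                 # average ranks are used for ties
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up predicted probabilities of the positive class and the true labels.
prob   <- c(0.95, 0.80, 0.60, 0.55, 0.45, 0.40, 0.30, 0.10)
is_pos <- c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE)

# AUC from the full ranking of probabilities.
auc_prob <- auc_rank(prob, is_pos)

# AUC when only the 0/1 predictions at a 0.5 threshold are available.
auc_class <- auc_rank(as.numeric(prob >= 0.5), is_pos)

print(c(probabilities = auc_prob, thresholded = auc_class))
```

On this toy data the probability-based value is 0.875 and the thresholded value is 0.75, which is the information loss the question is worried about.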

1 answer

"since whatever summary function is used gets passed predictions that are already binary"

This is definitely not the case.

train cannot use the classes to compute the ROC curve (unless you go out of your way to make it do so). See the note below.

train can predict the classes as factors (using the internal code that you show) and/or the class probabilities.

For example, this code will compute the class probabilities and use them to get the area under the ROC curve:

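library(caret)
library(mlbench)
data(Sonar)

## classProbs = TRUE makes train() keep the class probabilities,
## which twoClassSummary needs to compute the area under the ROC curve
ctrl <- trainControl(method = "cv",
                     summaryFunction = twoClassSummary,
                     classProbs = TRUE)
set.seed(1)
## metric = "ROC" selects the tuning parameters by cross-validated AUC
gbmTune <- train(Class ~ ., data = Sonar,
                 method = "gbm",
                 metric = "ROC",
                 verbose = FALSE,
                 trControl = ctrl)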

In fact, if you omit the classProbs = TRUE bit, you will get the error:

train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
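As a rough illustration of how the probabilities reach the summary function, here is a minimal sketch of a custom summary function that follows the function(data, lev, model) convention used by twoClassSummary. The names `auc_rank` and `probAUCSummary` are hypothetical, and the hand-built `holdout` data frame only mimics the obs, pred and per-class probability columns that a summary function is assumed to receive when classProbs = TRUE.

```r
# Hypothetical helper (not part of caret): AUC via the Mann-Whitney rank statistic.
auc_rank <- function(score, is_pos) {
  r <- rank(score)                 # average ranks are used for ties
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical summary function; the first level is treated as the event.
probAUCSummary <- function(data, lev = NULL, model = NULL) {
  if (is.null(lev)) lev <- levels(data$obs)
  # Assumption: classProbs = TRUE, so there is one probability column per level.
  if (!all(lev %in% names(data))) {
    stop("class probability columns are missing")
  }
  c(ROC = auc_rank(data[[lev[1]]], data$obs == lev[1]))
}

# Made-up held-out predictions shaped like the input to a summary function.
holdout <- data.frame(
  obs  = factor(c("M", "M", "R", "M", "R", "R"), levels = c("M", "R")),
  pred = factor(c("M", "M", "M", "R", "R", "R"), levels = c("M", "R")),
  M    = c(0.9, 0.7, 0.6, 0.4, 0.3, 0.2)
)
holdout$R <- 1 - holdout$M

res <- probAUCSummary(holdout, lev = levels(holdout$obs))
print(res)
```

A function like this could be passed as summaryFunction in trainControl() together with classProbs = TRUE and metric = "ROC"; the pred column is ignored, so the 0.5 threshold plays no part in the metric.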

Max


Source: https://habr.com/ru/post/1543556/

