"line search fails" when training a model using caret

I am using caret's train function to train an SVM with the svmRadial method (radial basis kernel) for a binary classification problem.

While train runs on my data, messages like the following are printed one after another:

line search fails -2.13865 -0.1759025 1.01927e-05 3.812143e-06 -5.240749e-08 -1.810113e-08 -6.03178e-13
line search fails -0.7148131 0.1612894 2.32937e-05 3.518543e-06 -1.821269e-08 -1.37704e-08 -4.726926e-13

Once the code has finished running, I also get the following warnings:

> warnings()
Warning messages:
1: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
2: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
3: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
4: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
5: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
6: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
7: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
8: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
9: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
10: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
11: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
12: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
13: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
14: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
15: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
16: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
17: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
18: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
19: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
20: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
21: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
22: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
23: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
24: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
25: In method$predict(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class prediction calculations failed; returning NAs
26: In method$prob(modelFit = modelFit, newdata = newdata,  ... :
  kernlab class probability calculations failed; returning NAs
27: In data.frame(..., check.names = FALSE) :
  row names were found from a short variable and have been discarded
28: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,  ... :
  There were missing values in resampled performance measures.

The warnings above say that NAs are returned because the kernlab class prediction and class probability calculations failed. Why do these calculations fail?

As requested by @HFBrowning, here is a sample of the data I am using. I am trying to build a classifier that predicts whether a telecommunications cell is overshooting or not overshooting (the Class column).

> head(imbal_training,10)
   Total.Tx.Height Antenna.Tilt Antenna.Gain Ant.Vert.Beamwidth       RTWP Voice.Drops Range Max.Distance Rural Suburban Urban
2            31.25            0         15.9               10.0 -103.55396          12  5.14         6.24     1        0     0
5            31.25            0         18.2                4.4 -104.76192           1  3.88         4.98     1        0     0
7            25.14            4         15.9                9.6 -102.93839           1  6.58         9.17     1        0     0
9            25.14            2         18.8                4.3 -104.23198           4  5.08         7.67     1        0     0
11           10.66            4         16.2               10.0  -98.23691          17 23.33        24.69     0        1     0
12           10.66            6         16.2               10.0 -103.78522           5 18.24        19.60     0        1     0
13           10.66            5         16.2               10.0  -94.59940           5 20.20        21.56     0        1     0
14           10.66            3         18.7                4.4 -103.17622           3 23.86        25.22     0        1     0
15           10.66            5         18.7                4.4 -104.97827           0 23.86        25.22     0        1     0
16           10.66            4         18.8                4.4 -105.78948           1 23.86        25.22     0        1     0
              Class HSUPA.Throughput Max.HSDPA.Users HS.DSCH.throughput Max.HSUPA.Users Avg.CQI
2  Not.Overshooting           222.62              16            2345.54              25   17.99
5      Overshooting           263.83               8            3894.07              13   21.82
7      Overshooting           392.66              14            5134.80              15   23.00
9      Overshooting           478.58               8            7203.39               8   24.70
11     Overshooting           173.21              11            2429.06              15   23.51
12     Overshooting           210.61              16            2694.93              20   19.76
13     Overshooting           205.81              11            3278.06              13   22.10
14     Overshooting           394.10              10            3881.88              13   25.01
15     Overshooting           371.71              10            3765.10              13   23.33
16     Overshooting           321.32               6            4422.15               8   24.85
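
A quick sanity check of the data can rule out some trivial causes before fitting. The following is only a minimal sketch in base R: the data frame `toy` and its values are made up for illustration, and only the column names are taken from the sample above. It reports the class proportions, the number of missing values per column, and which columns are constant (a constant column cannot be meaningfully centered and scaled).

```r
# Made-up miniature of the training data; only the column names come from the post.
toy <- data.frame(
  Class = factor(c("Overshooting", "Overshooting", "Not.Overshooting", "Overshooting")),
  RTWP  = c(-103.55, -104.76, NA, -104.23),
  Rural = c(1, 1, 1, 1)
)

# Proportion of rows in each class
class_share <- prop.table(table(toy$Class))

# Number of missing values in each column
na_count <- colSums(is.na(toy))

# TRUE for columns that hold fewer than two distinct non-missing values
is_constant <- vapply(
  toy,
  function(col) length(unique(col[!is.na(col)])) < 2,
  logical(1)
)

print(class_share)
print(na_count)
print(is_constant)
```

The same three checks could be run on the real training set by replacing `toy` with it.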

Here is my trainControl setup:

# run the algorithms using 10-fold cross-validation, repeated 3 times
set.seed(123)
train_Control <- trainControl(method = "repeatedcv",
                              number = 10,
                              repeats = 3,
                              savePredictions = TRUE,
                              classProbs = TRUE, # required for the ROC curve calcs
                              summaryFunction = twoClassSummary) # uses AUC to pick the best model

And here is my call to train:

# uses the rose_train dataset with a radial-kernel SVM
set.seed(123)
fit.rose.Kernel <- train(Class ~ Total.Tx.Height +
                         Antenna.Tilt +
                         Antenna.Gain +
                         Ant.Vert.Beamwidth +
                         RTWP +
                         Voice.Drops +
                         Range +
                         Max.Distance +
                         Rural +
                         Suburban +
                         Urban +
                         HSUPA.Throughput +
                         Max.HSDPA.Users +
                         HS.DSCH.throughput + 
                         Max.HSUPA.Users +
                         Avg.CQI, 
                       data = rose_train,
                       method = 'svmRadial',
                       preProcess = c('center','scale'),
                       trControl=train_Control,
                       tuneLength=15,
                       metric = "ROC")
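
Because savePredictions is switched on, the fitted object keeps the held-out predictions in its `pred` data frame, with one row per prediction and columns for the tuning parameters and the resample name. A minimal sketch of how the NA predictions could be located by cost value, assuming that layout: `pred_df` below is a made-up stand-in for `fit.rose.Kernel$pred`, and its values are invented for illustration.

```r
# Toy stand-in for fit.rose.Kernel$pred; the values are made up for illustration.
pred_df <- data.frame(
  pred = factor(
    c("Overshooting", NA, "Not.Overshooting", NA, NA, "Overshooting"),
    levels = c("Not.Overshooting", "Overshooting")
  ),
  C = c(0.25, 0.25, 0.5, 64, 64, 64),
  Resample = c("Fold01.Rep1", "Fold02.Rep1", "Fold01.Rep1",
               "Fold01.Rep1", "Fold02.Rep1", "Fold03.Rep1")
)

# Share of held-out predictions that came back as NA, per value of C
na_share <- tapply(is.na(pred_df$pred), pred_df$C, mean)
print(na_share)

# Resamples in which at least one prediction is NA
bad_resamples <- unique(pred_df$Resample[is.na(pred_df$pred)])
print(bad_resamples)
```

If the NAs cluster at particular values of C or in particular folds, that narrows down where the kernlab fit is breaking.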



Source: https://habr.com/ru/post/1016373/

