SMO optimization with RBFKernel (C and gamma)

When using RBF kernels with support vector machines, there are two parameters to tune: C and γ. It is not known beforehand which C and γ are best for a given problem, so some kind of model selection (parameter search) must be done. The goal is to identify a good (C, γ) pair so that the classifier can accurately predict unknown data (i.e. test data).

weka.classifiers.meta.GridSearch is a meta-classifier for tuning a pair of parameters. However, it seems to take ages to finish when the data set is fairly large. What would you suggest to reduce the time required for this task?

According to A User's Guide to Support Vector Machines:

C: the soft-margin constant. A smaller value of C allows points close to the boundary to be ignored, and increases the margin.

γ > 0 is the parameter that controls the width of the Gaussian.
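To make the role of γ concrete, here is a minimal sketch of the Gaussian (RBF) kernel in the form exp(-γ‖x − z‖²). The function name `rbf_kernel` and the sample points are illustrative assumptions, not part of Weka or of the guide quoted above.

```python
import math

def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Squared Euclidean distance between the two points.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Two illustrative points whose squared distance is 2.
x, z = [0.0, 0.0], [1.0, 1.0]
for gamma in (0.01, 1.0, 100.0):
    print(gamma, rbf_kernel(x, z, gamma))
```

A small γ gives a wide Gaussian, so distant points still look similar (kernel value near 1). A large γ gives a narrow Gaussian, so the same two points look unrelated (kernel value near 0).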

1 answer

Hastie et al.'s SVMPath explores the entire regularization path for C and requires only about the same computational cost as training a single SVM model. From their paper:

Our R function SvmPath computes all 632 steps in the mixture example (n+ = n- = 100, radial kernel, γ = 1) in 1.44 (0.02) seconds on a pentium 4, 2Ghz linux machine; the svm function (using the optimized code libsvm, from the R library e1071) takes 9.28 (0.06) seconds to compute the solution at 10 points along the path. Hence it takes our procedure about 50% more time to compute the entire path than it costs libsvm to compute a typical single solution.
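The arithmetic behind the "about 50%" figure can be checked directly from the timings quoted above. The variable names are only for illustration.

```python
path_seconds = 1.44      # SvmPath: all 632 steps of the path
libsvm_seconds = 9.28    # libsvm: solutions at 10 points along the path
libsvm_points = 10

per_solution = libsvm_seconds / libsvm_points   # cost of one libsvm solution
ratio = path_seconds / per_solution             # whole path vs. one solution

print(f"libsvm per solution: {per_solution:.3f} s")   # 0.928 s
print(f"whole path / one solution: {ratio:.2f}")      # 1.55
```

So the whole path costs roughly 1.55 times one libsvm solution, i.e. about 50% more.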

They released a GPLed implementation of the algorithm in R, which you can download from CRAN.

Using SVMPath should allow you to quickly find a good C value for any given γ. You still have to perform separate training runs for different γ values, but this should be much faster than performing a separate run for each (C, γ) pair.
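The resulting search can be sketched as a loop with one path computation per γ. This is only an outline under stated assumptions: `score_path` is a hypothetical callable (not the svmpath or Weka API) that returns (C, validation score) pairs along the path for a given γ, `toy_score_path` is a made-up stand-in so the sketch runs, and the candidate grids are arbitrary illustration values.

```python
import math
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: given gamma, return (C, validation_score) pairs
# for the C values of interest along one regularization path.
PathFn = Callable[[float], List[Tuple[float, float]]]

def select_c_gamma(gammas: Iterable[float],
                   score_path: PathFn) -> Tuple[float, float, float]:
    """Return (best_C, best_gamma, best_score), using one path per gamma."""
    best = None
    for gamma in gammas:
        # One path computation covers every C for this gamma.
        for c, score in score_path(gamma):
            if best is None or score > best[2]:
                best = (c, gamma, score)
    if best is None:
        raise ValueError("no candidates were evaluated")
    return best

# Toy stand-in (assumption): a real score_path would fit the path on
# training data and score each C on held-out data.
def toy_score_path(gamma: float) -> List[Tuple[float, float]]:
    cs = [2.0 ** k for k in range(-5, 16, 2)]
    return [(c, 1.0 / (1.0 + abs(math.log2(c) - 3) + abs(math.log2(gamma) + 5)))
            for c in cs]

gammas = [2.0 ** k for k in range(-15, 4, 2)]
best_c, best_gamma, best_score = select_c_gamma(gammas, toy_score_path)
print(best_c, best_gamma)
```

With these illustrative grids, the loop needs 10 path computations (one per γ) where a plain grid search would train 10 × 11 = 110 separate models.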


Source: https://habr.com/ru/post/1303714/

