SVM - hard or soft margins?

Given a linearly separable dataset, is it better to use a hard-margin SVM or a soft-margin SVM?

+54
algorithm machine-learning svm
2 answers

I would expect a soft-margin SVM to be better even when the training dataset is linearly separable. The reason is that in a hard-margin SVM, a single outlier can determine the boundary, which makes the classifier overly sensitive to noise in the data.

In the diagram below, a single red outlier essentially determines the boundary, which is the hallmark of overfitting.

[figure: hard-margin decision boundary dictated by a single red outlier]
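As a quick illustration of this sensitivity, here is a minimal sketch using scikit-learn's SVC (a libSVM wrapper); the data, the outlier position, and the C values are my own synthetic assumptions, not the figure's:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Two well-separated clusters, labels +1 and -1.
X = np.vstack([rng.randn(20, 2) + [2, 2],
               rng.randn(20, 2) + [-2, -2]])
y = np.array([1] * 20 + [-1] * 20)
# One outlier from class -1 placed deep inside class +1 territory.
X = np.vstack([X, [[1.8, 1.8]]])
y = np.append(y, -1)

hard = SVC(kernel="linear", C=1e6).fit(X, y)  # ~hard margin
soft = SVC(kernel="linear", C=0.1).fit(X, y)  # soft margin

for name, clf in [("hard", hard), ("soft", soft)]:
    w, b = clf.coef_[0], clf.intercept_[0]
    print(name, "w =", w, "b =", b,
          "train error =", 1 - clf.score(X, y))
```

The near-hard-margin fit contorts its boundary to classify the outlier correctly; the soft-margin fit simply misclassifies it and keeps a sensible boundary.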

To understand what the soft-margin SVM is doing, it is better to look at it in the dual formulation, where you can see that it has the same margin-maximizing objective (the margin can be negative) as the hard-margin SVM, but with the additional constraint that each Lagrange multiplier associated with a support vector is bounded by C. Essentially this bounds the influence of any single point on the decision boundary; for the derivation, see Proposition 6.12 in Cristianini/Shawe-Taylor, "An Introduction to Support Vector Machines and Other Kernel-based Learning Methods."

As a result, the soft-margin SVM can choose a decision boundary that has nonzero training error even when the dataset is linearly separable, and it is less likely to overfit.
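You can check the box constraint 0 ≤ αᵢ ≤ C empirically. A minimal sketch (scikit-learn again, synthetic blobs of my choosing; SVC's dual_coef_ stores αᵢyᵢ for the support vectors, so its absolute value is αᵢ):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=0)
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    alphas = np.abs(clf.dual_coef_[0])  # = alpha_i, since y_i is +-1
    print(f"C={C}: max alpha = {alphas.max():.4f}  (bound C = {C})")
```

The largest multiplier never exceeds C, which is exactly the "bounded influence" argument above.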

Here is an example using libSVM on a synthetic problem. The circled points are support vectors. You can see that decreasing C causes the classifier to sacrifice linear separability in order to gain stability, in the sense that the influence of any single data point is now bounded by C.

[figure: libSVM decision boundaries for decreasing values of C; support vectors circled]
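A rough way to reproduce this kind of experiment (a sketch on synthetic blobs, not the original dataset; scikit-learn's SVC wraps libSVM):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=80, centers=2, cluster_std=1.5, random_state=7)
for C in (1000, 10, 0.1):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:>6}: {clf.n_support_.sum()} support vectors, "
          f"train accuracy {clf.score(X, y):.2f}")
# Smaller C -> more support vectors, each with bounded influence,
# possibly at the cost of a few training errors.
```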

The meaning of the support vectors:

For a hard-margin SVM, the support vectors are the points that are "on the margin". In the picture above, C = 1000 is pretty close to a hard-margin SVM, and you can see that the circled points are the ones touching the margin (in that picture the margin is almost 0, so it is essentially the same as the separating hyperplane).

For soft-margin SVMs, it is easier to explain them in terms of the dual variables. The support-vector predictor, in terms of the dual variables, is the following function:

f(x) = sign( Σᵢ αᵢ yᵢ ⟨xᵢ, x⟩ + b )

Here the alphas and b are parameters found during the training procedure, the (xi, yi) are your training set, and x is the new data point. The support vectors are the data points from the training set that actually enter the predictor, i.e., those with a nonzero alpha parameter.
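A minimal sketch of that predictor with a linear kernel (scikit-learn names, assumed for illustration; dual_coef_ stores αᵢyᵢ for the support vectors, so the sum below runs only over points with nonzero alpha):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=1)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

x_new = X[0]  # pretend this is an unseen point
# sum_i alpha_i * y_i * <x_i, x_new> + b, over support vectors only
decision = clf.dual_coef_[0] @ (clf.support_vectors_ @ x_new) + clf.intercept_[0]
print(decision, clf.decision_function([x_new])[0])  # should match
```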

+120

In my opinion, a hard-margin SVM overfits a particular dataset and therefore cannot generalize. Even on a linearly separable dataset (as shown in the diagram above), outliers close to the boundary can influence the margin. A soft-margin SVM is more versatile because we control the choice of support vectors by tuning C.
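One common way to do that tuning in practice is cross-validated grid search; a minimal scikit-learn sketch (the data and the candidate grid are illustrative assumptions):

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=3)
search = GridSearchCV(SVC(kernel="linear"),
                      {"C": [0.01, 0.1, 1, 10, 100]}, cv=5)
search.fit(X, y)
print("best C:", search.best_params_["C"])
```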

+3
Oct 15 '13 at 1:23


