Mathematically, training an SVM is a convex optimization problem, usually with a unique minimizer: there is only one solution to the underlying mathematical optimization problem.
The differences in the results come from several aspects. SVC and LinearSVC are supposed to optimize the same problem, but in fact all liblinear estimators penalize the intercept, whereas libsvm ones do not (IIRC). This leads to a different mathematical optimization problem and hence to different results. There may be other subtle differences, such as scaling and the default loss function (edit: make sure you set loss='hinge' in LinearSVC). Also, in multiclass classification, liblinear does one-vs-rest by default, while libsvm does one-vs-one.
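A minimal sketch of this comparison (the dataset here is synthetic and purely illustrative): fit SVC with a linear kernel and LinearSVC with loss='hinge' on the same data and compare the learned coefficients. Because liblinear also penalizes the intercept, the weights are usually close but not identical.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

# Toy binary problem, just for illustration.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# SVC with a linear kernel uses libsvm: the intercept is NOT regularized,
# and multiclass problems would be handled one-vs-one.
svc = SVC(kernel='linear', C=1.0).fit(X, y)

# LinearSVC uses liblinear; loss='hinge' matches SVC's loss, but the
# intercept IS penalized, so the solutions need not coincide exactly.
# (dual=True is required for the hinge loss.)
lsvc = LinearSVC(loss='hinge', C=1.0, dual=True, max_iter=10000).fit(X, y)

print(svc.coef_)
print(lsvc.coef_)
```

On a well-separated toy problem like this, the two models typically agree on almost all predictions even though the coefficient vectors differ slightly.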
SGDClassifier(loss='hinge') differs from the other two in that it uses stochastic gradient descent rather than an exact batch solver, so it is not guaranteed to converge to the same solution. The resulting solution may, however, generalize better.
Between SVC and LinearSVC, one important decision criterion is that LinearSVC tends to converge faster as the number of samples grows. This is because the linear kernel is a special case that liblinear optimizes for, while libsvm does not.
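A rough way to check this yourself is to time both estimators on a larger synthetic dataset (the sample size of 5000 below is an arbitrary choice; exact timings depend on the machine, so none are asserted here):

```python
import time
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

# At larger sample counts, libsvm's kernel solver slows down markedly,
# while liblinear exploits the linear kernel directly.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

t0 = time.perf_counter()
SVC(kernel='linear').fit(X, y)
t_svc = time.perf_counter() - t0

t0 = time.perf_counter()
LinearSVC(max_iter=10000).fit(X, y)
t_lsvc = time.perf_counter() - t0

print(f"SVC: {t_svc:.2f}s  LinearSVC: {t_lsvc:.2f}s")
```

On most machines LinearSVC finishes noticeably faster at this sample size, and the gap widens as n_samples grows.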