Support vector machine vs K Nearest neighbors

Question

I have a dataset for classification. Using the KNN algorithm I get 90% accuracy, but with SVM I only get a little over 70%. Is SVM not better than KNN? I know this might be a silly question, but which SVM parameters would give results roughly comparable to KNN? I am using the libsvm package on MATLAB R2008.

+6

matlab machine-learning libsvm

Mohit jain Oct 17 '13 at 8:43

3 answers

It depends on the data set you are using. If your data looks like the first row of this image ( http://scikit-learn.org/stable/_images/plot_classifier_comparison_1.png ), kNN will work very well and a linear SVM will work very poorly.

If you want the SVM to do better, you can use a kernel SVM, as in the picture (it uses the RBF kernel).

If you are using scikit-learn for Python, you can experiment with the code here to see how to use the SVM kernels: http://scikit-learn.org/stable/modules/svm.html
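As a rough illustration of the point above, here is a minimal scikit-learn sketch that compares kNN, a linear SVM and an RBF-kernel SVM on held-out data. The `make_moons` data set, the noise level and the values `k=5`, `C=1.0` and `gamma=2.0` are all assumptions chosen for the example; they stand in for the asker's data, which is not shown.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data that a straight line separates poorly
# (an assumption; replace with your own X and y).
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Hyperparameters below are illustrative, not tuned.
models = {
    "kNN (k=5)": KNeighborsClassifier(n_neighbors=5),
    "linear SVM": make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0)),
    "RBF SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma=2.0)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on held-out data
    print(f"{name}: {scores[name]:.2f}")
```

On curved class boundaries like these, the linear SVM would be expected to trail the other two, which is the pattern the linked image shows.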

+6

Adriennk Oct 17 '13 at 9:01

kNN basically says: "If you are close to coordinate x, then the classification will be similar to the results observed at x." In SVM, a close analogue would be a high-dimensional kernel with a "small" bandwidth parameter, since this makes the SVM fit the training data more closely (that is, overfit more). The SVM then comes closer to "if you are close to coordinate x, then the classification will be similar to the one observed at x".
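The bandwidth effect can be seen without training an SVM at all. The sketch below is not an SVM: it is a plain kernel-weighted vote (Parzen-window style), which has the same form as an SVM decision function but with equal weights on every training point and no bias term. The data, the labelling rule and the two `gamma` values are assumptions made up for the example; `gamma` is the inverse-bandwidth parameter of the Gaussian kernel, so a large `gamma` means a small bandwidth.

```python
import math
import random

def rbf(a, b, gamma):
    """Gaussian (RBF) similarity exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kernel_vote(train, query, gamma):
    """Sign of the kernel-weighted sum of the +1/-1 training labels."""
    total = sum(label * rbf(point, query, gamma) for point, label in train)
    return 1 if total >= 0 else -1

def nearest_neighbor(train, query):
    """Label of the single closest training point (1-NN)."""
    return min(train, key=lambda item: math.dist(item[0], query))[1]

def make_point(rng):
    """Hypothetical data: label +1 inside a disc centred in the unit square."""
    x, y = rng.random(), rng.random()
    label = 1 if (x - 0.5) ** 2 + (y - 0.5) ** 2 < 0.1 else -1
    return (x, y), label

rng = random.Random(0)
train = [make_point(rng) for _ in range(200)]
queries = [make_point(rng)[0] for _ in range(200)]

def agreement(gamma):
    """Fraction of queries where the kernel vote matches 1-NN."""
    same = sum(
        kernel_vote(train, q, gamma) == nearest_neighbor(train, q) for q in queries
    )
    return same / len(queries)

narrow = agreement(5000.0)  # small bandwidth: the closest points dominate the sum
wide = agreement(0.1)       # large bandwidth: every point counts almost equally
print(f"small bandwidth agrees with 1-NN on {narrow:.0%} of queries")
print(f"large bandwidth agrees with 1-NN on {wide:.0%} of queries")
```

With a small bandwidth the nearest training points dominate the sum, so the vote behaves much like a nearest-neighbor rule; with a large bandwidth it drifts toward the overall majority class.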

I recommend that you start with the Gaussian kernel and check the results for different parameter values. In my own experience (which, of course, is biased toward certain types of data sets, so your mileage may vary), a tuned SVM outperforms a tuned kNN.
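One common way to "check the results for different parameters" is a cross-validated grid search over the SVM cost `C` and the Gaussian kernel parameter `gamma`. The scikit-learn sketch below runs on synthetic data; the data set and the grid values are assumptions, not recommendations. In libsvm the same two parameters are set with the `-c` and `-g` options.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real data (an assumption).
X, y = make_moons(n_samples=300, noise=0.25, random_state=1)

# Scale the features, then fit an RBF-kernel SVM.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Coarse logarithmic grid; the values are illustrative only.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.01, 0.1, 1, 10],
}

# 5-fold cross-validation scores every (C, gamma) pair on held-out folds.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean cross-validated accuracy:", round(search.best_score_, 3))
```

A coarse logarithmic grid like this is usually refined around the best cell once a promising region is found.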

Questions for you:

1) How do you choose k in kNN?

2) What parameters did you try to use for SVM?

3) Do you measure accuracy in-sample or out-of-sample?
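The third question matters more than it may seem. The pure-Python sketch below uses a hand-written kNN on made-up noisy data (the data generator, the 20% label-noise rate and the values of k are all assumptions) to show that in-sample accuracy can flatter kNN: with k=1, every training point is its own nearest neighbor, so in-sample accuracy is perfect regardless of how noisy the labels are.

```python
import math
import random

def knn_predict(train, query, k):
    """Majority vote over the k nearest training points (labels are +1/-1)."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes >= 0 else -1

def accuracy(train, data, k):
    """Fraction of points in `data` that kNN fitted on `train` labels correctly."""
    hits = sum(knn_predict(train, point, k) == label for point, label in data)
    return hits / len(data)

def noisy_sample(rng, n, flip=0.2):
    """Hypothetical data: label is +1 when x > y, then flipped with probability `flip`."""
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = 1 if x > y else -1
        if rng.random() < flip:
            label = -label
        data.append(((x, y), label))
    return data

rng = random.Random(42)
train = noisy_sample(rng, 200)
test = noisy_sample(rng, 200)

# With k=1 each training point finds itself first, so in-sample accuracy is 1.0.
in_sample = accuracy(train, train, k=1)
out_of_sample = accuracy(train, test, k=1)

for k in (1, 5, 15):
    print(
        f"k={k}: in-sample {accuracy(train, train, k):.2f}, "
        f"out-of-sample {accuracy(train, test, k):.2f}"
    )
```

Comparing kNN and SVM is only meaningful when both accuracies come from data held out of training, and the same held-out or cross-validated score is a reasonable basis for choosing k.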

+5

Max Oct 17 '13 at 13:15

Shai · Accepted Answer · 2013-10-17T08:56:58+0000

kNN and SVM represent different learning approaches. Each approach implies a different model for the underlying data.

SVM assumes that there is a hyperplane separating the data points (a very restrictive assumption), while kNN tries to approximate the underlying data distribution in a nonparametric way (a crude approximation of a Parzen-window estimator).

You will need to look into the specifics of your scenario to decide which algorithm and configuration are best suited to it.
