How to perform POS tagging using SVM in Python?

Question

I want to do POS tagging with an SVM on a non-English corpus in Python. It seems that NLTK does not yet support tagging with an SVM ( http://www.nltk.org/_modules ).

scikit-learn has an SVM module, so I installed scikit-learn and tried it from Python, but I cannot find any tutorials on POS tagging with an SVM.

I really don't know what to do; any help would be appreciated.

python scikit-learn svm nltk pos-tagger

Sam black Sep 05 '15 at 9:53

1 answer

hellpanderr · Answer 1 · 2015-09-05T12:55:44+0000

Does it have to be an SVM? NLTK has built-in tools for POS tagging: Categorizing and Tagging Words

If you want to use your own classifier, look here: http://www.nltk.org/api/nltk.classify.html , Ctrl + F "svm". NLTK provides a wrapper for scikit-learn called SklearnClassifier. Then look at http://www.nltk.org/api/nltk.tag.html , Ctrl + F "classifier". There is a class nltk.tag.sequential.ClassifierBasedPOSTagger that, apparently, can use wrapped classifiers from sklearn.

I have not tried this, but it could work.

EDIT: It should work as follows:

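import nltk
from nltk.classify import SklearnClassifier
from sklearn.svm import SVC

# Wrap scikit-learn's SVC so that NLTK can use it as a classifier
clf = SklearnClassifier(SVC(), sparse=False)
# train_sents: your training data, a list of sentences of (word, tag) tuples
cpos = nltk.tag.sequential.ClassifierBasedPOSTagger(
    train=train_sents,
    classifier_builder=lambda train_feats: clf.train(train_feats))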

The only problem is that sklearn classifiers only accept numerical features, so you need to convert them somehow.
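One common way to do that conversion is one-hot encoding: every (feature name, string value) pair gets its own column holding 1.0 or 0.0, while numeric and boolean features keep a single column. Here is a minimal pure-Python sketch of the idea. The function names word_features, build_vocabulary and vectorize and the feature set itself are made up for illustration and are not part of NLTK or scikit-learn. In practice scikit-learn's DictVectorizer does the same kind of encoding for you.

```python
def word_features(tokens, i):
    """Hypothetical feature detector: a few context features for tokens[i]."""
    word = tokens[i]
    return {
        "word": word.lower(),
        "suffix3": word[-3:].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<START>",
        "is_capitalized": word[:1].isupper(),
    }


def build_vocabulary(feature_dicts):
    """Assign a column index to every 'name=value' string feature
    and to every numeric/boolean feature name."""
    vocab = {}
    for feats in feature_dicts:
        for name, value in feats.items():
            key = f"{name}={value}" if isinstance(value, str) else name
            vocab.setdefault(key, len(vocab))
    return vocab


def vectorize(feats, vocab):
    """Turn one feature dict into a list of floats (one-hot for strings)."""
    row = [0.0] * len(vocab)
    for name, value in feats.items():
        if isinstance(value, str):
            key, number = f"{name}={value}", 1.0
        else:
            key, number = name, float(value)
        if key in vocab:  # features unseen during training are ignored
            row[vocab[key]] = number
    return row


tokens = ["The", "cat", "sleeps"]
feature_dicts = [word_features(tokens, i) for i in range(len(tokens))]
vocab = build_vocabulary(feature_dicts)
matrix = [vectorize(feats, vocab) for feats in feature_dicts]
```

Each row of matrix is now a purely numerical vector that could be passed to a classifier such as SVC, with the POS tags as the target labels.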
