How to perform POS tagging using SVM in Python?

I want to do POS tagging with an SVM on a non-English corpus in Python. It seems that Python does not yet support tagging with an SVM (see http://www.nltk.org/_modules ).

scikit-learn has an SVM module, so I installed scikit-learn and used it from Python, but I cannot find any tutorials on POS tagging with an SVM.

I really don't know what to do. Any help would be appreciated.

1 answer

Does it have to be an SVM? NLTK has built-in tools for POS tagging: see "Categorizing and Tagging Words".

If you want to use your own classifier, look at http://www.nltk.org/api/nltk.classify.html and search (Ctrl + F) for "svm": NLTK provides a wrapper for scikit-learn called SklearnClassifier. Then look at http://www.nltk.org/api/nltk.tag.html and search for "classifier": there is a class nltk.tag.sequential.ClassifierBasedPOSTagger that, apparently, can use wrapped classifiers from sklearn.

I have not tried this, but it could work.

EDIT: It should work as follows:

import nltk
from nltk.classify import SklearnClassifier
from sklearn.svm import SVC

# train_sents: a list of tagged sentences, e.g. [[("word", "TAG"), ...], ...]
clf = SklearnClassifier(SVC(), sparse=False)
cpos = nltk.tag.sequential.ClassifierBasedPOSTagger(
    train=train_sents,
    classifier_builder=lambda train_feats: clf.train(train_feats))

The only problem is that sklearn classifiers only accept numerical features, so you need to convert them somehow.
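One way to do that conversion is one-hot encoding: every distinct (feature name, value) pair gets its own column, set to 1.0 when the token has that pair. The sketch below is a hand-rolled illustration in plain Python, not NLTK's or scikit-learn's implementation. The feature set in token_features and the helper names fit_vocabulary and vectorize are hypothetical, chosen only for this example. scikit-learn's DictVectorizer offers a similar conversion if you prefer a library routine.

```python
def token_features(words, i):
    """Symbolic features for the word at position i (hypothetical feature set)."""
    word = words[i]
    return {
        "word": word.lower(),
        "suffix3": word[-3:].lower(),
        "prev_word": words[i - 1].lower() if i > 0 else "<START>",
        "is_capitalized": word[:1].isupper(),
    }


def fit_vocabulary(feature_dicts):
    """Assign one column index to every distinct (feature name, value) pair."""
    pairs = sorted(
        {(name, str(value)) for feats in feature_dicts for name, value in feats.items()}
    )
    return {pair: column for column, pair in enumerate(pairs)}


def vectorize(feats, vocabulary):
    """One-hot encode a feature dict; pairs not seen while fitting are ignored."""
    row = [0.0] * len(vocabulary)
    for name, value in feats.items():
        column = vocabulary.get((name, str(value)))
        if column is not None:
            row[column] = 1.0
    return row


# Toy sentence, used only to show the shapes involved.
sentence = ["The", "cat", "sleeps"]
feature_dicts = [token_features(sentence, i) for i in range(len(sentence))]
vocabulary = fit_vocabulary(feature_dicts)
X = [vectorize(feats, vocabulary) for feats in feature_dicts]
```

Each row of X is a fixed-length numeric vector, which is the kind of input an sklearn estimator such as SVC expects. The vocabulary has to be fitted on the training data only and then reused unchanged at tagging time, so that the columns mean the same thing in both places.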

Source: https://habr.com/ru/post/1606044/