Smoothing in python NLTK

I am using the Naive Bayes classifier in Python to classify text. Are there any smoothing methods in Python NLTK to avoid assigning zero probability to unseen words? Thanks in advance!

1 answer

I would suggest replacing all words with a low frequency (especially a frequency of 1) with <unseen>, and then training the classifier on that data. At classification time, query the model with <unseen> whenever a word does not appear in the training data.
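A minimal sketch of this idea with NLTK's NaiveBayesClassifier; the toy documents, the UNSEEN token name, and the frequency threshold of 1 are illustrative assumptions, not part of the answer:

```python
# Sketch: map rare/unknown words to a shared <unseen> token so the
# classifier has actually observed that token during training.
from nltk import FreqDist, NaiveBayesClassifier

UNSEEN = "<unseen>"  # assumed placeholder token name

# Toy labeled documents: (token list, label)
train_docs = [
    (["great", "fun", "movie"], "pos"),
    (["boring", "dull", "movie"], "neg"),
    (["great", "acting", "story"], "pos"),
    (["dull", "plot", "story"], "neg"),
]

# Word frequencies over the whole training set.
freq = FreqDist(w for tokens, _ in train_docs for w in tokens)

def features(tokens):
    """Bag-of-words features; words with frequency <= 1 (or unknown) map to UNSEEN."""
    return {(w if freq[w] > 1 else UNSEEN): True for w in tokens}

# Train on feature dicts in which low-frequency words were replaced.
train_set = [(features(tokens), label) for tokens, label in train_docs]
classifier = NaiveBayesClassifier.train(train_set)

# At prediction time, words absent from the training data also fall back to UNSEEN,
# so they reuse the probability the model learned for the placeholder.
print(classifier.classify(features(["great", "unknownword", "movie"])))
```

Because both classes contain rare words at training time, the model learns a probability for the <unseen> feature, and previously unseen words at prediction time are handled through that same placeholder instead of getting zero probability.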


Source: https://habr.com/ru/post/1445696/

