NLTK quadgram collocation finder

I have seen several lengthy questions and answers claiming that NLTK can only find collocations for bigrams and trigrams.

For example: How to get n-gram collocations and associations in python nltk?

I see that there is something called

nltk.QuadgramCollocationFinder

alongside

nltk.BigramCollocationFinder and nltk.TrigramCollocationFinder

But at the same time I can't see anything like

nltk.collocations.QuadgramAssocMeasures()

similar to nltk.collocations.BigramAssocMeasures() and nltk.collocations.TrigramAssocMeasures().

What is the purpose of nltk.QuadgramCollocationFinder if it is not possible (without hacks) to score n-grams beyond bigrams and trigrams?

Maybe I missed something.

Thanks,

Alvas,

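import nltk
from nltk.collocations import BigramCollocationFinder, QuadgramCollocationFinder
from nltk.corpus import PlaintextCorpusReader
from nltk.metrics.association import QuadgramAssocMeasures

bigram_measures = nltk.collocations.BigramAssocMeasures()
trigram_measures = nltk.collocations.TrigramAssocMeasures()
quadgram_measures = QuadgramAssocMeasures()

# `corpus` is not defined in this snippet; it must be a sequence of word tokens.

# apply_ngram_filter() removes every n-gram for which the filter returns True,
# so this filter keeps only the n-grams that contain 'crazy'.
the_filter = lambda *w: 'crazy' not in w

# Ten best bigrams that occur at least 3 times
finder = BigramCollocationFinder.from_words(corpus)
finder.apply_freq_filter(3)
finder.apply_ngram_filter(the_filter)
print(finder.nbest(bigram_measures.likelihood_ratio, 10))

# Ten best quadgrams that occur at least 3 times
finder = QuadgramCollocationFinder.from_words(corpus)
finder.apply_freq_filter(3)
finder.apply_ngram_filter(the_filter)
print(finder.nbest(quadgram_measures.likelihood_ratio, 10))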
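To make the two filter steps concrete, here is a minimal standard-library sketch of what they amount to for quadgrams. It is not NLTK's implementation: the helper names quadgram_counts and filter_and_rank and the toy token list are made up for illustration, and candidates are ranked by raw count rather than by likelihood ratio.

```python
from collections import Counter


def quadgram_counts(tokens):
    """Count every run of four consecutive tokens."""
    return Counter(zip(tokens, tokens[1:], tokens[2:], tokens[3:]))


def filter_and_rank(counts, min_freq, drop_if, n):
    """Keep quadgrams seen at least min_freq times for which drop_if(*gram)
    is false, then return the n most frequent (ties broken alphabetically)."""
    kept = {g: c for g, c in counts.items()
            if c >= min_freq and not drop_if(*g)}
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return [gram for gram, _ in ranked[:n]]


# Hypothetical toy corpus: two phrases, each repeated three times.
tokens = ("a crazy old red fox " * 3 + "the calm old red fox " * 3).split()

# Same shape as the_filter above: True means "drop this n-gram".
drop_without_crazy = lambda *w: 'crazy' not in w

top = filter_and_rank(quadgram_counts(tokens), 3, drop_without_crazy, 10)
print(top)
```

Only the quadgrams that both contain 'crazy' and occur at least three times survive; quadgrams that straddle a phrase boundary, such as ('red', 'fox', 'a', 'crazy'), occur only twice and are removed by the frequency filter.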

repo:

from nltk.metrics.association import QuadgramAssocMeasures

Source: https://habr.com/ru/post/1619552/

