I have seen several questions and answers saying that NLTK collocation finding cannot be done beyond bigrams and trigrams.
For example:
How to get n-gram collocations and associations in python nltk?
I see that there is something called
nltk.QuadgramCollocationFinder
Similarly
nltk.BigramCollocationFinder and nltk.TrigramCollocationFinder
But at the same time I can't see anything like
nltk.collocations.QuadgramAssocMeasures()
similar to nltk.collocations.BigramAssocMeasures() and nltk.collocations.TrigramAssocMeasures()
What is the purpose of nltk.QuadgramCollocationFinder if it is not possible (without hacks) to find n-grams beyond bigrams and trigrams?
Maybe I missed something.
Thanks,
Alvas,
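import nltk
from nltk.collocations import BigramCollocationFinder, QuadgramCollocationFinder
from nltk.corpus import PlaintextCorpusReader
# QuadgramAssocMeasures lives in nltk.metrics.association
from nltk.metrics.association import QuadgramAssocMeasures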
bigram_measures = nltk.collocations.BigramAssocMeasures()
trigram_measures = nltk.collocations.TrigramAssocMeasures()
quadgram_measures = QuadgramAssocMeasures()
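# apply_ngram_filter() removes every n-gram for which the function returns True,
# so this keeps only the n-grams that contain the word 'crazy'.
the_filter = lambda *w: 'crazy' not in w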
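# `corpus` must be a sequence of word tokens; it is not defined in this snippet.
# One way to build it (the path and pattern are placeholders):
# corpus = PlaintextCorpusReader('/path/to/corpus', r'.*\.txt').words()

# Bigrams: drop pairs seen fewer than 3 times, then rank by likelihood ratio.
finder = BigramCollocationFinder.from_words(corpus)
finder.apply_freq_filter(3)
finder.apply_ngram_filter(the_filter)
print(finder.nbest(bigram_measures.likelihood_ratio, 10))

# Quadgrams: same steps, scored with QuadgramAssocMeasures.
finder = QuadgramCollocationFinder.from_words(corpus)
finder.apply_freq_filter(3)
finder.apply_ngram_filter(the_filter)
print(finder.nbest(quadgram_measures.likelihood_ratio, 10))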
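
If you don't have a corpus at hand, the effect of the two filters on four-word sequences can be checked with a plain-Python sketch. It does not use NLTK. The helper names (quadgram_counts, apply_filters) and the toy token list are made up for illustration, and it only mimics the filtering step, not the likelihood-ratio scoring:

```python
from collections import Counter

def quadgram_counts(words):
    """Count every run of four consecutive tokens."""
    return Counter(zip(words, words[1:], words[2:], words[3:]))

def apply_filters(counts, min_freq, drop_if):
    """Keep quadgrams seen at least min_freq times and not rejected by drop_if."""
    return {
        gram: freq
        for gram, freq in counts.items()
        if freq >= min_freq and not drop_if(*gram)
    }

# Toy token list, invented for this example.
tokens = ("this is a crazy idea . " * 3 + "this is a good idea . " * 3).split()

# Same predicate as above: reject n-grams that do NOT contain 'crazy'.
the_filter = lambda *w: 'crazy' not in w

kept = apply_filters(quadgram_counts(tokens), min_freq=3, drop_if=the_filter)
for gram, freq in sorted(kept.items()):
    print(gram, freq)
```

Only quadgrams that occur at least three times and contain 'crazy' survive, which is the same selection the finder makes before nbest() ranks what is left.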