POS tagging, which gives you the tags that let you inspect the tense of a verb, already takes the context of the sentence into account, so it also addresses your concern about accuracy through context. In fact, POS tagging doesn't really work on words in isolation! Look at this example from Ch. 5 of the NLTK book, where the context of the sentence lets NLTK tell nouns from verbs for homonyms (i.e. a word like permit, which can have different meanings as a verb and as a noun):
Let's look at another example, this time containing some homonyms:
>>> import nltk
>>> text = nltk.word_tokenize("They refuse to permit us to obtain the refuse permit")
>>> nltk.pos_tag(text)
[('They', 'PRP'), ('refuse', 'VBP'), ('to', 'TO'), ('permit', 'VB'), ('us', 'PRP'),
('to', 'TO'), ('obtain', 'VB'), ('the', 'DT'), ('refuse', 'NN'), ('permit', 'NN')]
Notice that refuse and permit each appear both as a verb (VBP and VB, respectively) and as a noun (NN). E.g. refUSE is a verb meaning "deny", while REFuse is a noun meaning "trash" (i.e. they are not homophones). Thus, we need to know which word is being used in order to pronounce the text correctly. (For this reason, text-to-speech systems usually perform POS tagging.)
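To get from tags to verb tense, one option is to map the Penn Treebank verb tags to a short description and filter the tagged sentence. The following is only a minimal sketch: the `VERB_TAGS` table and the `verb_forms` helper are hypothetical names introduced here, not part of NLTK, and the input is the tagged output shown above, pasted in by hand so the snippet runs without the tagger models.

```python
# Penn Treebank verb tags and the verb form each one indicates.
VERB_TAGS = {
    "VB": "base form",
    "VBD": "past tense",
    "VBG": "gerund or present participle",
    "VBN": "past participle",
    "VBP": "present tense, not 3rd person singular",
    "VBZ": "present tense, 3rd person singular",
}


def verb_forms(tagged):
    """Return (position, word, description) for every verb in a tagged sentence.

    `tagged` is a list of (word, tag) pairs, e.g. the result of nltk.pos_tag().
    """
    return [
        (i, word, VERB_TAGS[tag])
        for i, (word, tag) in enumerate(tagged)
        if tag in VERB_TAGS
    ]


# Tagged output copied from the example above.
tagged = [
    ("They", "PRP"), ("refuse", "VBP"), ("to", "TO"), ("permit", "VB"),
    ("us", "PRP"), ("to", "TO"), ("obtain", "VB"), ("the", "DT"),
    ("refuse", "NN"), ("permit", "NN"),
]

for position, word, form in verb_forms(tagged):
    print(position, word, form)
```

Only the first "refuse" and the first "permit" are reported as verbs; the second occurrences are tagged NN and are skipped, which is exactly the context sensitivity described above.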