Verb tense identification in Python

How can I use Python + NLTK to determine if a sentence is past / present / future?

Can I do this with POS tags alone? That seems a little inaccurate; it seems to me that I need to consider the context of the sentence, not just the individual words.

Any suggestion for another library that can do this?

2 answers

It won't be too hard to do this yourself. This table should help you identify the different verb tenses and handle them; it's just a matter of analyzing the result of nltk.pos_tag(tokens), where tokens is the tokenized sentence.

I'm not sure whether you want to get into all the irregular verb constructions, such as "could be", but if you only want present / past / future, this is a very simple parsing task.
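As a rough illustration of that parsing task, here is a minimal sketch. It assumes the input is a list of (word, tag) pairs in the Penn Treebank tag set, which is what nltk.pos_tag returns. The function name sentence_tense, the tag groupings, and the list of future modals are my own assumptions for the example, not part of NLTK.

```python
from typing import List, Tuple

# Penn Treebank tags for finite verbs, as produced by nltk.pos_tag.
PAST_TAGS = {"VBD"}
PRESENT_TAGS = {"VBP", "VBZ"}
# Assumption: these modal tokens are treated as future markers.
# "'ll" and "wo" are how "I'll" and "won't" come out of the tokenizer.
FUTURE_MODALS = {"will", "shall", "'ll", "wo"}


def sentence_tense(tagged: List[Tuple[str, str]]) -> str:
    """Guess one tense for a whole sentence from its (word, tag) pairs."""
    # Rule 1: a future modal anywhere in the sentence marks it as future.
    for word, tag in tagged:
        if tag == "MD" and word.lower() in FUTURE_MODALS:
            return "future"
    # Rule 2: otherwise the first finite verb decides past vs. present.
    for _word, tag in tagged:
        if tag in PAST_TAGS:
            return "past"
        if tag in PRESENT_TAGS:
            return "present"
    # No finite verb found (e.g. a fragment or an imperative).
    return "unknown"
```

With NLTK and its tokenizer and tagger data installed, you would call it as sentence_tense(nltk.pos_tag(nltk.word_tokenize(sentence))). Rule 1 is deliberately crude: it labels the whole sentence as future even when only a subordinate clause is.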

I don't know of any library that does this on its own. I've always thought about training a model to do it for me, but I never got around to it.

There will be some degree of error, but it won't be large. I recommend parsing all the verbs and then deciding how you want to handle the tense, because in a sentence like "I'm glad that he will see it", the main tense is present, but there is a future-tense clause ([that] he will see it). So it comes down to the linguistics of your problem, which you didn't specify, but you get the idea.
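To make the "parse all the verbs" suggestion concrete, here is a small sketch that reports a tense per verb group instead of one label per sentence. The function name verb_tenses and the modal list are hypothetical, and the tags in the example are hand-written to look like Penn Treebank tagger output rather than taken from an actual NLTK run.

```python
from typing import List, Tuple

# Assumption: these modal tokens are treated as future markers.
FUTURE_MODALS = {"will", "shall", "'ll", "wo"}


def verb_tenses(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return a (verb group, tense) pair for each finite verb found."""
    results = []
    pending_modal = None  # a future modal still waiting for its base verb
    for word, tag in tagged:
        if tag == "MD" and word.lower() in FUTURE_MODALS:
            pending_modal = word
        elif tag == "VB" and pending_modal is not None:
            # Future modal followed (possibly after adverbs) by a base verb.
            results.append((f"{pending_modal} {word}", "future"))
            pending_modal = None
        elif tag == "VBD":
            results.append((word, "past"))
        elif tag in ("VBP", "VBZ"):
            results.append((word, "present"))
    return results


# Hand-tagged example with a present main clause and a future clause.
example = [("I", "PRP"), ("am", "VBP"), ("glad", "JJ"), ("that", "IN"),
           ("he", "PRP"), ("will", "MD"), ("see", "VB"), ("it", "PRP")]
print(verb_tenses(example))
```

Whether such a sentence counts as present or future overall is then your decision, for example by taking the first verb group as the main clause.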


The POS tagger, which gives you the tags that let you look at the tense of the verb, already takes the context of the sentence into account, so it addresses your concern about accuracy through context. In fact, POS tagging doesn't work properly on words in isolation! Look at this example from Ch. 5 of the NLTK book, where the context of the sentence lets NLTK distinguish nouns from verbs for homonyms (i.e. a word like "permit", which can have different meanings as a verb and as a noun):

Let's look at another example, this time containing some homonyms:

>>> import nltk
>>> text = nltk.word_tokenize("They refuse to permit us to obtain the refuse permit")
>>> nltk.pos_tag(text)
[('They', 'PRP'), ('refuse', 'VBP'), ('to', 'TO'), ('permit', 'VB'), ('us', 'PRP'),
('to', 'TO'), ('obtain', 'VB'), ('the', 'DT'), ('refuse', 'NN'), ('permit', 'NN')]

Notice that refuse and permit both appear as a present tense verb (VBP) and a noun (NN). E.g. refUSE is a verb meaning "deny", while REFuse is a noun meaning "trash" (i.e. they are not homophones). So we need to know which word is being used in order to pronounce the text correctly. (For this reason, text-to-speech systems usually perform POS tagging.)


Source: https://habr.com/ru/post/986456/

