Use / application of part-of-speech (POS) tagging

I understand the implicit meaning of part-of-speech tags and have seen mentions of their use in parsing, text-to-speech conversion, etc.

Could you tell me how the output of a POS tagger is formatted? Also, could you explain how such output is used by other tasks / parts of an NLP system?

+6
2 answers

One of the goals of POS tagging is to disambiguate homonyms. For example, take this sentence:

I fish a fish

The same sentence in French would be Je pêche un poisson. Without tagging, fish would be translated the same way in both cases, which would lead to an incorrect translation. After POS tagging, however, the sentence becomes

I_PRON fish_VERB a_DET fish_NOUN

From the computer's point of view, the two words are now different. This way, they can be processed much more reliably (in our example, fish_VERB will be translated to pêche and fish_NOUN to poisson).
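To make the idea concrete, here is a minimal sketch of how a downstream step could use the tags. It assumes a toy lexicon keyed by (word, tag) pairs; the `LEXICON` table and the `parse_tagged` and `translate` helpers are made up for illustration and are not part of any real translation system.

```python
# Hypothetical toy lexicon keyed by (lowercased token, POS tag).
# The entries only cover the example sentence.
LEXICON = {
    ("i", "PRON"): "je",
    ("fish", "VERB"): "pêche",
    ("a", "DET"): "un",
    ("fish", "NOUN"): "poisson",
}


def parse_tagged(sentence):
    """Split 'word_TAG word_TAG ...' into a list of (word, tag) pairs."""
    pairs = []
    for item in sentence.split():
        # rpartition splits on the last underscore, so words that
        # themselves contain '_' keep their full form.
        word, _, tag = item.rpartition("_")
        pairs.append((word, tag))
    return pairs


def translate(tagged_sentence):
    """Word-by-word lookup; pairs missing from the lexicon pass through unchanged."""
    out = []
    for word, tag in parse_tagged(tagged_sentence):
        out.append(LEXICON.get((word.lower(), tag), word))
    return " ".join(out)


print(translate("I_PRON fish_VERB a_DET fish_NOUN"))
```

Because the lookup key includes the tag, the two occurrences of fish resolve to different entries, which a lookup on the bare word could not do.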

+6

Basically, the goal of a POS tagger is to assign linguistic (mostly grammatical) information to sub-sentential units. Such units are called tokens and, in most cases, correspond to words and symbols (for example, punctuation).

As for the output format, it doesn't really matter as long as you get a sequence of token / tag pairs. Some POS taggers let you specify the output format, others use XML or CSV / TSV, etc.
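As a rough sketch of why the format matters so little: the same sequence of token / tag pairs can be written out in several shapes and read back without loss. The helper names below (`to_inline`, `to_tsv`, `from_tsv`) are hypothetical, and the tags are the ones from the example in the other answer.

```python
import csv
import io

# The tagger's result as a plain sequence of (token, tag) pairs.
tagged = [("I", "PRON"), ("fish", "VERB"), ("a", "DET"), ("fish", "NOUN")]


def to_inline(pairs, sep="_"):
    """Render pairs on one line, e.g. 'I_PRON fish_VERB ...'."""
    return " ".join(f"{tok}{sep}{tag}" for tok, tag in pairs)


def to_tsv(pairs):
    """Render pairs as TSV text, one token and its tag per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(pairs)
    return buf.getvalue()


def from_tsv(text):
    """Read TSV text back into a list of (token, tag) pairs."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [tuple(row) for row in reader if row]


print(to_inline(tagged))
print(to_tsv(tagged), end="")
```

A downstream component such as a parser only needs the pairs themselves, so it can sit behind a small reader like `from_tsv` regardless of which tagger produced the file.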

+2

Source: https://habr.com/ru/post/970161/

