I haven't tried untrained sentiment analysis of the kind you describe, but off the top of my head I'd say you are oversimplifying the problem. Simply analyzing adjectives is not enough to get a good grasp of the sentiment of a text. Consider the word "stupid": on its own you would classify it as negative, but if a product review said "... [x] product makes its competitors look stupid for not thinking of this feature first ...", the sentiment there is clearly positive. The wider context in which words appear definitely matters for something like this. That is why an untrained bag-of-words approach alone (let alone an even more limited bag of adjectives) is not enough to tackle this problem adequately.
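To make the "stupid" example concrete, here is a minimal sketch of an untrained, word-list-based scorer. The word lists and the `lexicon_score` function are made up for illustration, and the review sentence is a paraphrase of the example above; this is not any particular library's method.

```python
import re

# Hypothetical, hand-picked word lists, purely for illustration.
POSITIVE = {"great", "good", "excellent"}
NEGATIVE = {"stupid", "bad", "terrible"}

def lexicon_score(text):
    """Count positive minus negative word-list hits, ignoring all context."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Paraphrase of the review example: the reviewer is praising the product.
review = ("This product makes its competitors look stupid "
          "for not thinking of this feature first.")
print(lexicon_score(review))
```

The score comes out below zero because the only word-list hit is "stupid", even though the review is favorable. The scorer has no way of knowing the word is aimed at the competitors.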
Pre-classified data ("training data") helps because it changes the problem: instead of trying to determine from scratch whether a text has positive or negative sentiment, you determine whether the text is more similar to known positive texts or to known negative texts, and classify it accordingly. The other big point is that textual analyses such as sentiment analysis are often strongly affected by how the characteristics of texts differ from domain to domain. This is why having a good training dataset (that is, accurate data from the domain you are working in, and hopefully representative of the texts you will need to classify) is just as important as building a good system to classify with.
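As a rough sketch of what "more similar to positive texts or negative texts" can look like in code, here is a tiny multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. Naive Bayes is just one common choice that I am assuming for illustration; the four training sentences and all function names are invented, and a real system would need far more data from your actual domain.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def train(labeled_texts):
    """Collect per-label word counts and per-label document counts."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labeled_texts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts

def classify(text, model):
    """Return the label whose training texts the input most resembles."""
    word_counts, doc_counts = model
    vocab = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Log prior: share of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                # Add-one (Laplace) smoothed log likelihood of the word.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy training data; real data should come from your own domain.
TRAINING = [
    ("makes the competitors look stupid, love this feature", "pos"),
    ("great product, works well", "pos"),
    ("stupid design, broke after a week", "neg"),
    ("terrible product, waste of money", "neg"),
]

model = train(TRAINING)
print(classify("this feature makes its competitors look stupid", model))
```

On this toy data the sentence is labeled "pos", because the words surrounding "stupid" appear in the positive examples. That only shows the mechanism; with four training sentences it says nothing about accuracy on real reviews.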
Not quite an article, but hope this helps.
waffle paradox, Oct. 13 '10 at 6:35