What tools do you use to analyze text?

I need some inspiration. For a hobby project I am playing with content analysis. Basically, I am trying to parse input text and match it against a topic map.

For example:

  • "Way to Iraq" > History, Middle East
  • "Halloumni" > Food, Middle East
  • "BMW" > Germany, Cars
  • "Obama" > USA
  • "Impala" > USA, Cars
  • "The Berlin Wall" > History, Germany
  • "Bratwurst" > Food, Germany
  • "Cheeseburger" > Food, USA
  • ...

I have read a lot about taxonomies, and in the end everything I read concludes that people all tag differently, so such a system is bound to fail.
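To make this concrete, the most naive version of what I have in mind is something like the Python sketch below. The TOPIC_KEYWORDS table and the match_topics function are made up purely for illustration, not an existing library:

    # Toy illustration: a hand-maintained keyword -> topics table
    # plus a dumb substring match against the input.
    TOPIC_KEYWORDS = {
        "Cars": ["bmw", "impala"],
        "Food": ["halloumni", "bratwurst", "cheeseburger"],
        "Germany": ["bmw", "berlin", "bratwurst"],
        "History": ["iraq", "berlin wall"],
        "Middle East": ["iraq", "halloumni"],
        "USA": ["obama", "impala", "cheeseburger"],
    }

    def match_topics(text: str) -> list[str]:
        """Return every topic whose keywords occur in the input text."""
        lowered = text.lower()
        return sorted({topic
                       for topic, keywords in TOPIC_KEYWORDS.items()
                       for keyword in keywords
                       if keyword in lowered})

    if __name__ == "__main__":
        for phrase in ["The Berlin Wall", "Cheeseburger", "BMW"]:
            print(phrase, "->", match_topics(phrase))
            # e.g. The Berlin Wall -> ['Germany', 'History']

This works on the toy examples, but the whole point is that I do not want to maintain that keyword table by hand.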

So: what tools or approaches would you use to analyze text and match it to a topic map like this? Any ideas, articles, or other sources of inspiration are welcome.


Take a look at OpenCalais. It is a free web service that extracts entities, facts, and events from the text you submit (it is run by Reuters).
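I do not have sample code with a real license key, but the integration is basically an HTTP POST of your text plus an API key. The Python sketch below shows the shape of it; the endpoint URL and header names are placeholders rather than the real OpenCalais API, so check their documentation for the actual values:

    # Rough shape of calling an entity/topic extraction web service
    # such as OpenCalais. URL and header names below are PLACEHOLDERS,
    # not the real OpenCalais endpoint -- see their docs for the real API.
    import requests

    API_URL = "https://api.example-entity-service.com/extract"  # placeholder
    API_KEY = "your-license-key-here"                           # placeholder

    def extract_topics(text: str) -> dict:
        """POST raw text, get back entities/topics as JSON."""
        response = requests.post(
            API_URL,
            headers={
                "x-api-key": API_KEY,        # header name is an assumption
                "Content-Type": "text/raw",
                "Accept": "application/json",
            },
            data=text.encode("utf-8"),
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    print(extract_topics("BMW is a German car maker based in Munich."))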

To be honest, this is not an easy problem. For single terms like the ones in your examples, you could look them up in WordNet and use the broader concepts (hypernyms) they are linked to as candidate topics.
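If you want to try that quickly, NLTK's WordNet interface makes the hypernym walk a few lines of Python; this is only a sketch of the lookup idea, not a full topic mapper:

    # Walk up WordNet hypernyms to get broader concepts for a term.
    # Requires: pip install nltk, then nltk.download("wordnet") once.
    from nltk.corpus import wordnet as wn

    def broader_concepts(word: str, max_hops: int = 3) -> list[str]:
        """Collect hypernym names up to max_hops above the word's first sense."""
        synsets = wn.synsets(word)
        if not synsets:
            return []
        concepts, current = [], synsets[0]
        for _ in range(max_hops):
            hypernyms = current.hypernyms()
            if not hypernyms:
                break
            current = hypernyms[0]
            concepts.append(current.name())
        return concepts

    print(broader_concepts("cheeseburger"))
    # roughly: ['hamburger.n.01', 'sandwich.n.01', 'snack_food.n.01']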

For anything longer than a single term, though, dictionary lookups will not get you far; matching free text against a set of topics is essentially a text-classification problem, and that is an active area of NLP (natural language processing).

You could also look at Jena, a Java framework for RDF, and keep your topic map in RDF (there are mapper tools for that). Another idea is to lean on Wikipedia: look up the term's article (does the "Cheeseburger" article talk about food? about the USA? :), and follow its categories, its "See also" links, and so on.
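Jena itself is Java, but the Wikipedia part is easy to experiment with from any language through the public MediaWiki API. Here is a rough Python sketch that pulls an article's categories, which you could then map onto your own topics:

    # Look up the Wikipedia categories of a term via the MediaWiki API.
    import requests

    def wikipedia_categories(title: str) -> list[str]:
        """Return the category names attached to the Wikipedia article `title`."""
        params = {
            "action": "query",
            "prop": "categories",
            "titles": title,
            "cllimit": "max",
            "format": "json",
        }
        resp = requests.get("https://en.wikipedia.org/w/api.php",
                            params=params, timeout=30)
        resp.raise_for_status()
        pages = resp.json()["query"]["pages"]
        names = []
        for page in pages.values():
            for cat in page.get("categories", []):
                names.append(cat["title"].removeprefix("Category:"))
        return names

    print(wikipedia_categories("Cheeseburger"))
    # e.g. something like ['American cuisine', 'Hamburgers', ...]

If you would rather have the same data as RDF (which is where Jena comes in), DBpedia publishes Wikipedia's content, including the category links, in RDF form.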

If you work in PHP or Perl, have a look around CPAN; there are modules for most of these building blocks, so you would not have to write everything from scratch.

In any case, it is an interesting problem. Good luck! :)


SemanticHacker does pretty much what you are asking for, and it has a free API. Here is what it returned for your examples:

  • "Way to Iraq" > // /
  • "Halloumni" > N/A
  • "BMW" > //
  • "Obama" > //
  • "Impala" > / /Chevrolet
  • "The Berlin Wall" > ///
  • "Bratwurst" > //
  • "Cheeseburger" > // ; / / //

If I were doing this, I would look into using a Bayesian network. Another option is Solr (a search server built on Lucene).
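To make that concrete, the usual starting point is a plain naive Bayes text classifier trained on a few example phrases per topic. The sketch below uses scikit-learn with a made-up toy training set, so it is simple naive Bayes rather than a full Bayesian network:

    # Minimal naive Bayes topic classifier with scikit-learn.
    # The tiny training set is invented just to mirror the question's topics.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train_texts = [
        "bmw audi autobahn engine",         # Cars
        "impala chevrolet v8 engine",       # Cars
        "bratwurst sausage grill mustard",  # Food
        "cheeseburger fries diner",         # Food
        "berlin wall cold war division",    # History
        "iraq war baghdad invasion",        # History
    ]
    train_labels = ["Cars", "Cars", "Food", "Food", "History", "History"]

    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(train_texts, train_labels)

    print(model.predict(["a bratwurst with mustard"]))     # ['Food']
    print(model.predict(["the fall of the berlin wall"]))  # ['History']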

Also check out CI-Bayes. Joseph Ottinger wrote an article about it on theserverside.net earlier this year.


Source: https://habr.com/ru/post/1709347/

