Design Idea for Computational Linguistics Using Hadoop MapReduce

I need to do a project for a computational linguistics course. Is there an interesting "linguistic" problem that is data-intensive enough to be worth tackling with Hadoop MapReduce? The solution or algorithm should analyze the data and give some insight into the "linguistic" domain. However, it must be applicable to large datasets so that I can use Hadoop for it. I know there is a Python natural language processing toolkit that can be used with Hadoop.

+3
4 answers

If you have large corpora in some "unusual" languages (in the sense of "those for which only a limited amount of computational linguistics has been done"), repeating existing computational linguistics work already done for very popular languages (such as English, Chinese, Arabic, ...) is a perfectly suitable project. That holds especially in an academic setting, but it can be quite suitable for industry too: back when I was in computational linguistics with IBM Research, I got interesting mileage out of a corpus for Italian and out of repeating, at the relatively new IBM Scientific Center in Rome, work very similar to what the IBM Research team at Yorktown Heights (of which I had been a part) had already done for English.


+2

One corpus to consider: roughly 300M words from 60K open-access (OA) papers published by BioMed Central.


  • BioNLP.org


+2
+2

As you mentioned, there is Python's NLTK, which can be used together with dumbo to run on Hadoop.

PyCon 2010 had a good talk on exactly this topic. You can access the slides from the talk using the link below.
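To give a feel for what such a job might look like, here is a minimal sketch of bigram counting written as a map step and a reduce step. It is not taken from the talk and does not call dumbo or NLTK: the regex tokenizer is a stand-in for a real NLTK tokenizer, and `tokenize`, `mapper`, `reducer` and `run_local` are hypothetical names chosen for illustration. `run_local` only simulates the map, shuffle/sort and reduce phases in one process, so the logic can be tried on a small sample before moving it onto a cluster.

```python
import re
from itertools import groupby

# Assumption: a crude letters-only tokenizer stands in for a real NLTK tokenizer.
TOKEN_RE = re.compile(r"[^\W\d_]+", re.UNICODE)


def tokenize(line):
    """Lowercase the line and split it into alphabetic tokens."""
    return [token.lower() for token in TOKEN_RE.findall(line)]


def mapper(key, value):
    """Map step: emit ((left, right), 1) for every adjacent token pair in a line."""
    tokens = tokenize(value)
    for left, right in zip(tokens, tokens[1:]):
        yield (left, right), 1


def reducer(key, values):
    """Reduce step: sum the partial counts collected for one bigram."""
    yield key, sum(values)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    pairs = []
    for offset, line in enumerate(lines):
        pairs.extend(mapper(offset, line))

    # The framework would normally sort and group by key between the phases.
    pairs.sort(key=lambda kv: kv[0])

    counts = {}
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        for out_key, out_value in reducer(key, (v for _, v in group)):
            counts[out_key] = out_value
    return counts


if __name__ == "__main__":
    sample = ["the cat sat on the mat", "The cat ate"]
    for bigram, count in sorted(run_local(sample).items()):
        print(" ".join(bigram), count)
```

Because each line is mapped independently, bigrams never span a line break here. Whether that is acceptable depends on how the corpus is split into records, so treat it as a design choice to revisit for real data.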

+1

Source: https://habr.com/ru/post/1734813/

