Hadoop WordCount example: do I need to do some performance tuning?

I am new to Hadoop.

I recently wrote an implementation of the WordCount example.

But when I run the program on my single node with 2 input files containing only 9 words, it takes about 33 seconds! That seems crazy, and it confuses me.

Can someone tell me whether this is normal?

How can I fix this problem? Remember, I only created 2 input files with 9 words in them.

Submit Host Address: 127.0.0.1
Job-ACLs: All users are allowed
Job Setup: Successful
Status: Succeeded
Started at: Fri Aug 05 14:27:22 CST 2011
Finished at: Fri Aug 05 14:27:53 CST 2011
Finished in: 30sec

+2
2 answers

Hadoop is not efficient for very small jobs, because starting the JVMs, initializing the tasks, and similar setup work take longer than the actual processing. This overhead can be reduced to some extent by enabling JVM reuse.

http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Task+JVM+Reuse
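As a rough illustration of the JVM reuse setting described in the linked tutorial, here is a minimal sketch using the old `org.apache.hadoop.mapred` API from the 0.20 line. The class name `JvmReuseConfig` is hypothetical, the snippet assumes the Hadoop jar is on the classpath, and it only builds the configuration without submitting a job.

```java
import org.apache.hadoop.mapred.JobConf;

// Hypothetical driver fragment: shows only the JVM reuse setting,
// not a complete WordCount job. Requires Hadoop 0.20.x on the classpath.
public class JvmReuseConfig {
    public static void main(String[] args) {
        JobConf conf = new JobConf(JvmReuseConfig.class);

        // 1 (the default) means one task per JVM, i.e. no reuse.
        // -1 means no limit on how many tasks of the same job a JVM may run.
        conf.setNumTasksToExecutePerJvm(-1);

        // The same setting expressed as a raw property:
        // conf.setInt("mapred.job.reuse.jvm.num.tasks", -1);

        System.out.println("Tasks per JVM: " + conf.getNumTasksToExecutePerJvm());
    }
}
```

With only two tiny input files there are very few tasks to share a JVM, so the saving in a case like the one in the question is probably modest.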

Also, there is some ongoing work on this in Apache Hadoop:

https://issues.apache.org/jira/browse/MAPREDUCE-1220

I am not sure which release it will be included in or what the current state of the JIRA is.

+3

This is not unusual. Hadoop only pays off with large datasets. What you are seeing is probably Hadoop's fixed startup time.
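To make the point about fixed startup time concrete, here is a small back-of-the-envelope model. It is only a sketch: the 30 seconds of overhead is taken from the job summary in the question and treated as pure startup cost, while the per-word processing cost is a made-up value chosen for illustration. The class `StartupOverheadModel` and its methods are hypothetical.

```java
// Hypothetical model: total job time = fixed startup cost + per-word work.
// All numbers are assumptions for illustration, not measurements.
final class StartupOverheadModel {
    // Assumed fixed cost per job (JVM start, task setup, scheduling), in seconds.
    static final double FIXED_OVERHEAD_SECONDS = 30.0;

    // Assumed processing cost per word, in seconds (invented value).
    static final double SECONDS_PER_WORD = 1e-6;

    // Estimated total runtime for a job over the given number of words.
    static double totalSeconds(long words) {
        return FIXED_OVERHEAD_SECONDS + words * SECONDS_PER_WORD;
    }

    // Fraction of the total runtime that is fixed overhead.
    static double overheadShare(long words) {
        return FIXED_OVERHEAD_SECONDS / totalSeconds(words);
    }

    public static void main(String[] args) {
        long[] inputSizes = {9L, 1_000_000L, 1_000_000_000L};
        for (long words : inputSizes) {
            System.out.printf("%d words: %.1f s total, %.1f%% overhead%n",
                    words, totalSeconds(words), 100.0 * overheadShare(words));
        }
    }
}
```

Under these assumptions, nearly all of the runtime for 9 words is overhead, while for a billion words the same 30 seconds is only a few percent of the total.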

+3

Source: https://habr.com/ru/post/1393659/