Hadoop WordCount example: Do I need to do some performance tuning?

Question


I am new to Hadoop.

I recently implemented the WordCount example.

But when I run the program on my single node with 2 input files containing only 9 words, it takes about 33 seconds to finish! That seems crazy to me, and I am confused.

Can someone tell me whether this is normal or not?

How can I fix this problem? Remember, I only created 2 input files with 9 words in them.

Submit Host Address: 127.0.0.1
Job-ACLs: All users are allowed
Job Setup: Successful
Status: Succeeded
Started at: Fri Aug 05 14:27:22 CST 2011
Finished at: Fri Aug 05 14:27:53 CST 2011
Finished in: 30 seconds


java hadoop

jackalope Aug 05 '11 at 7:48


2 answers

This is not unusual. Hadoop pays off with large datasets. What you are seeing is probably Hadoop's fixed startup overhead, which dominates the run time when the input is this small.


Otto allmendinger Aug 05 '11 at 7:53


Praveen sripati · Accepted Answer · 2011-08-05T09:51:49+0000

Hadoop is not efficient for very small jobs, because starting the JVMs, initializing the task processes, and other setup steps take longer than the actual work. This overhead can be reduced to some extent by enabling JVM reuse.

http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Task+JVM+Reuse
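As a rough sketch of what the tutorial linked above describes: JVM reuse is controlled by the job property `mapred.job.reuse.jvm.num.tasks`. The default of 1 means one task per JVM (no reuse), and -1 removes the limit on how many tasks of the same job a JVM may run. The fragment below assumes a 0.20-era installation where job defaults are set in `conf/mapred-site.xml`; that file location is an assumption about your setup, not something stated in the question.

```xml
<!-- Assumed location: conf/mapred-site.xml (Hadoop 0.20.x layout) -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- 1 = no reuse (default); -1 = no limit on tasks per JVM for the same job -->
  <value>-1</value>
</property>
```

The same value can also be set per job through `JobConf.setNumTasksToReuseJvm(int)`. With only 2 input files there are very few tasks to share a JVM, so the gain here is likely to be small; most of the 30 seconds is fixed job setup and scheduling overhead.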

Also, there is some related work in progress in Apache Hadoop:

https://issues.apache.org/jira/browse/MAPREDUCE-1220

I am not sure which release it will be included in, or what the current state of the JIRA issue is.
