C++ Hadoop WordCount with Pipes not working

I am trying to run the WordCount example in C++, following the approach described here: Running the WordCount program in C++. The compilation works fine (my code closely follows the tutorial; a sketch is included after the error output below), but when I try to run my program I get an error:

bin/hadoop pipes -conf ../dev/word.xml -input testtile.txt -output wordcount-out
11/06/06 14:23:40 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/06/06 14:23:40 INFO mapred.FileInputFormat: Total input paths to process : 1
11/06/06 14:23:40 INFO mapred.JobClient: Running job: job_201106061207_0007
11/06/06 14:23:41 INFO mapred.JobClient:  map 0% reduce 0%
11/06/06 14:23:53 INFO mapred.JobClient: Task Id : attempt_201106061207_0007_m_000000_0, Status : FAILED
java.io.IOException
        at org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication(OutputHandler.java:188)
        at org.apache.hadoop.mapred.pipes.Application.waitForAuthentication(Application.java:194)
        at org.apache.hadoop.mapred.pipes.Application.<init>(Application.java:149)
        at org.apache.hadoop.mapred.pipes.PipesMapRunner.run(PipesMapRunner.java:68)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:253)
attempt_201106061207_0007_m_000000_0: Server failed to authenticate. Exiting
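
For reference, my program closely follows the tutorial's Pipes WordCount. This is only a rough sketch of it, assuming the standard Pipes headers that ship with Hadoop:

#include <string>
#include <vector>

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"

// Mapper: emit ("word", "1") for every token of the input line.
class WordCountMapper : public HadoopPipes::Mapper {
public:
  WordCountMapper(HadoopPipes::TaskContext&) {}
  void map(HadoopPipes::MapContext& context) {
    std::vector<std::string> words =
        HadoopUtils::splitString(context.getInputValue(), " ");
    for (size_t i = 0; i < words.size(); ++i) {
      context.emit(words[i], "1");
    }
  }
};

// Reducer: sum the counts emitted for each word.
class WordCountReducer : public HadoopPipes::Reducer {
public:
  WordCountReducer(HadoopPipes::TaskContext&) {}
  void reduce(HadoopPipes::ReduceContext& context) {
    int sum = 0;
    while (context.nextValue()) {
      sum += HadoopUtils::toInt(context.getInputValue());
    }
    context.emit(context.getInputKey(), HadoopUtils::toString(sum));
  }
};

int main() {
  return HadoopPipes::runTask(
      HadoopPipes::TemplateFactory<WordCountMapper, WordCountReducer>());
}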

I run Hadoop on Fedora on two nodes, and I followed the configuration instructions from this guide: Running Hadoop on a multi-node cluster. I tried Hadoop's own Java WordCount example with this command:

bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount testtile.txt wordcount-out

And that command works fine. That is why I do not understand why my own program does not. I hope someone has an idea of what I am doing wrong, or has already run into and resolved this error.

1 answer

I do not know whether I should answer my own question this way or edit the question instead. In any case, I found the solution, and I just want to share it with anyone who gets the same error.

After several days of research and experimenting, I realized that the combination of Fedora, C++ and 64 bit does not work out of the box with Hadoop Pipes. I tried to compile the Hadoop C++ WordCount with ant, as described in the wiki, but ant gave me errors about libssl and stdint.

First, on Fedora you need to add -lcrypto to the LIBS variables in the configure scripts. This is because the dependency on libcrypto must now be stated explicitly on this platform when linking against libssl (see the corresponding Fedora bug).
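
Concretely, this is roughly the change I made; the exact files and the exact line may differ in your source tree (mine is 0.20.203):

# in src/c++/pipes/configure and src/c++/utils/configure, change
LIBS="-lssl ${LIBS}"
# to
LIBS="-lssl -lcrypto ${LIBS}"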

Second problem: ant reports many errors in the C++ files. The fix is simply to add an include of stdint.h at the top of each file it complains about.
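
For example, at the top of each file ant complained about I added:

#include <stdint.h>  /* fixed-width integer types (uint64_t, int32_t, ...) that newer gcc/glibc no longer pull in implicitly */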

After that the build succeeds. I ran the freshly built wordcount example on my Hadoop cluster and it works, while my own program still did not. I suspected the problem came from the library I had just patched, and I was right: when I ran the Hadoop example against the library from the Hadoop installation directory, it did not work either, and I got the same error.

This is explained by the fact that ant recompiles the C++ libraries needed by Hadoop Pipes (with the fix applied) and uses them, instead of the libraries shipped in the Hadoop installation directory.
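
In practice that means compiling and linking your own binary against the headers and libraries ant just built, rather than the prebuilt ones under the installation directory. My link line looks roughly like this; the build output paths and the platform directory name are from my machine and are only an example:

g++ wordcount.cpp -o wordcount \
    -I"$HADOOP_HOME/build/c++/Linux-amd64-64/include" \
    -L"$HADOOP_HOME/build/c++/Linux-amd64-64/lib" \
    -lhadooppipes -lhadooputils -lpthread -lcrypto -lssl

Note that -lcrypto and -lssl come after the Pipes libraries that need them, for the same libssl/libcrypto reason as above.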



