How to run a MapReduce job using the java -jar command

I am writing a MapReduce job in Java. My configuration settings:

    Configuration configuration = new Configuration();
    configuration.set("fs.defaultFS", "hdfs://127.0.0.1:9000");
    configuration.set("mapreduce.job.tracker", "localhost:54311");
    configuration.set("mapreduce.framework.name", "yarn");
    configuration.set("yarn.resourcemanager.address", "localhost:8032");

I have tried launching it in several different ways:

Case 1: Using the hadoop and yarn commands: success

Case 2: Using Eclipse: success

Case 3: Using java -jar after removing all the .set() configuration calls:

    Configuration configuration = new Configuration();

This runs successfully, but the job status does not appear in the YARN web UI (default port 8088).

Case 4: Using java -jar: error

Stack trace:

    Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
        at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
        at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1255)
        at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1251)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
        at org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1279)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
        at com.my.cache.run.MyTool.run(MyTool.java:38)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at com.my.main.Main.main(Main.java:45)

Could you tell me how to run a MapReduce job with the java -jar command, and how to check the job status in the YARN web UI (default port 8088)?

Why I need this: I want to build a web service that submits MapReduce jobs (without using the Java runtime to execute the yarn or hadoop commands).

1 answer

In my opinion, it is rather difficult to run a Hadoop application without the hadoop command, because java -jar does not put Hadoop's jars or its configuration directory on the classpath. You are better off using hadoop jar than java -jar.

I suspect you do not have a working Hadoop environment on your machine. First, you need to make sure Hadoop runs well on your computer.

Personally, I prefer putting the configuration in mapred-site.xml, core-site.xml, yarn-site.xml, and hdfs-site.xml rather than setting it in code. I know of a clear Hadoop cluster installation tutorial you can follow.
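For example, the properties set in code in the question could instead live in the XML files. A minimal sketch (the values mirror the ones from the question; adjust hosts and ports to your setup, and note that each snippet below goes in its own file):

```xml
<!-- core-site.xml: default filesystem -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://127.0.0.1:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml: use YARN as the MapReduce framework -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- yarn-site.xml: ResourceManager address -->
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:8032</value>
  </property>
</configuration>
```

With the configuration in these files, a plain new Configuration() picks the values up automatically whenever the files are on the classpath, which is exactly what the hadoop command arranges for you.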

At this point, you can monitor HDFS on port 50070, the YARN cluster on port 8088, and the MapReduce job history server on port 19888.

You should then verify that your HDFS and YARN environments work. For HDFS, try simple commands such as mkdir, copyToLocal, copyFromLocal, etc., and for YARN, try the sample wordcount project.
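Independently of the cluster, you can sanity-check the word-count logic itself with plain Java before packaging it as a Hadoop job. A minimal local sketch (no Hadoop dependencies; the class and method names are illustrative, not part of any Hadoop API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LocalWordCount {
    // Mimics the map + reduce phases of the classic wordcount example:
    // split the text into tokens (map), then sum the count per token (reduce).
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("the quick fox the fox"));
    }
}
```

Once the per-record logic behaves as expected, porting it into a Mapper/Reducer pair is mostly boilerplate.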

Once the Hadoop environment works, you can create your own MapReduce application (you can use any IDE); you may find a tutorial helpful for this. Compile it and package it into a jar.

Then open your terminal and run this command:

 hadoop jar <path to jar> <arg1> <arg2> ... <arg n> 

Hope this helps.


Source: https://habr.com/ru/post/1200297/
