Submit a Hadoop Job

I need to continuously get the run times of the mappers and reducers. I implemented it as follows.

    JobClient jobclient = new JobClient(conf);
    RunningJob runjob = jobclient.submitJob(conf);
    TaskReport[] maps = jobclient.getMapTaskReports(runjob.getID());
    long mapDuration = 0;
    for (TaskReport rpt : maps) {
        mapDuration += rpt.getFinishTime() - rpt.getStartTime();
    }

However, when I run the program, the job does not seem to be submitted, and it never starts. How can I use JobClient.runJob(conf) and still get the run times?

1 answer

The submitJob() method returns control to the caller immediately, without waiting for the Hadoop job to start, let alone finish. If you want to wait, use the waitForCompletion() method, which returns only after the Hadoop job has completed. I think you want something in between, since you want to run your subsequent code after submission but before completion.

I suggest you put your subsequent code in a loop that runs until the job is complete (use the isComplete() method for this test) and observe the mappers and reducers as the job progresses. You will probably also want to put a Thread.sleep(xxx) in the loop.
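A minimal sketch of such a loop, written so that it compiles without Hadoop on the classpath. JobView and MapTimePoller are hypothetical names, not Hadoop classes: isComplete() stands in for RunningJob.isComplete(), and mapTaskTimes() stands in for reading getStartTime() and getFinishTime() from the TaskReport array returned by jobclient.getMapTaskReports(runjob.getID()). It also assumes that a task which has not finished yet reports a finish time of 0, so such tasks are skipped instead of producing a negative duration.

```java
// Hypothetical stand-in for the Hadoop calls, so the sketch runs on its own.
interface JobView {
    // Mirrors RunningJob.isComplete().
    boolean isComplete() throws Exception;

    // One {startTime, finishTime} pair per map task, in milliseconds,
    // as would be read from each TaskReport.
    long[][] mapTaskTimes() throws Exception;
}

final class MapTimePoller {
    // Sum finish - start over the tasks that have already finished.
    // Assumption: an unfinished task reports a finish time of 0.
    static long finishedMapMillis(long[][] times) {
        long total = 0;
        for (long[] t : times) {
            if (t[1] > 0) {
                total += t[1] - t[0];
            }
        }
        return total;
    }

    // Poll until the job is complete, printing the running total on each pass,
    // then return the final total once every task has a finish time.
    static long pollUntilComplete(JobView job, long sleepMillis) throws Exception {
        while (!job.isComplete()) {
            long soFar = finishedMapMillis(job.mapTaskTimes());
            System.out.println("Map time so far: " + soFar + " ms");
            Thread.sleep(sleepMillis);
        }
        return finishedMapMillis(job.mapTaskTimes());
    }
}
```

In a real program the same loop would call runjob.isComplete() directly, and the reducer times could be collected the same way from jobclient.getReduceTaskReports(runjob.getID()).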

To answer your comment, you want ...

    job.waitForCompletion();
    // getTaskCompletionEvents takes the index of the first event to fetch
    TaskCompletionEvent[] events = job.getTaskCompletionEvents(0);
    for (int i = 0; i < events.length; i++) {
        System.out.println("Task " + i + " took " + events[i].getTaskRunTime() + " ms");
    }

Source: https://habr.com/ru/post/1482668/

