Measure Hadoop job execution time with JobControl

I used to run my Hadoop job with the following:

    long start = new Date().getTime();
    boolean status = job.waitForCompletion(true);
    long end = new Date().getTime();

That way, I could measure the time taken by the job directly in my code, as soon as it finished.

Now I have to use JobControl to express the dependencies between my jobs:

    JobControl jobControl = new JobControl("MyJob");
    jobControl.addJob(job1);
    jobControl.addJob(job2);
    job3.addDependingJob(job2);
    jobControl.addJob(job3);
    jobControl.run();

However, once jobControl.run() is called, it never returns, so any code placed after it (for example, code that polls jobControl.getState() to detect when the jobs have completed) is never reached.

How can I measure job execution time when using JobControl?

1 answer

JobControl does not provide convenient hooks for obtaining this information. You have some (potentially painful) options:

  • Run JobControl.run() in a separate thread and, in your main thread, poll the JobControl.getXXXJobs() methods to track job state changes.
  • Look at using the job end notification URL. To do this, you need to run a "server" on your client to receive the event notification, and then work backwards to determine when the job finished.
  • Extend the JobControl and jobcontrol.Job classes to record when a job changes state, and add methods for querying the start and end times.
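
The first option can be sketched roughly as follows. This is a minimal illustration and not a definitive implementation: the class JobControlTimer and its method timeUntilFinished are hypothetical names introduced here, and the helper is written against plain Java interfaces so that it does not depend on a particular Hadoop version. It relies on JobControl implementing Runnable and exposing allFinished() and stop(). The run() method keeps looping until stop() is called, which is why calling it directly on the main thread never returns.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of Hadoop): times a blocking runner by
// starting it on a background thread and polling a completion check.
final class JobControlTimer {

    private JobControlTimer() { }

    /**
     * Starts runner on a daemon thread, polls finished every pollMillis
     * milliseconds until it returns true, then calls stopper.
     * Returns the elapsed wall-clock time in milliseconds. If the calling
     * thread is interrupted, polling stops early and the time so far is returned.
     */
    static long timeUntilFinished(Runnable runner, BooleanSupplier finished,
                                  Runnable stopper, long pollMillis) {
        long start = System.nanoTime();

        Thread worker = new Thread(runner, "job-control-runner");
        worker.setDaemon(true); // do not keep the JVM alive for this thread
        worker.start();

        while (!finished.getAsBoolean()) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
        }

        long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        stopper.run(); // lets the runner's loop terminate
        return elapsedMillis;
    }
}

// Assumed wiring with the jobControl object from the question:
//
//   long millis = JobControlTimer.timeUntilFinished(
//           jobControl, jobControl::allFinished, jobControl::stop, 500);
//   System.out.println("All jobs finished in " + millis + " ms");
```

This measures the total time for the whole group of jobs. Per-job timings would need additional bookkeeping inside the polling loop, for example by recording the first time each job shows up in the successful or failed job lists.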

Source: https://habr.com/ru/post/921702/

