Spark yarn-cluster mode: how to get the applicationId from spark-submit

When I submit a Spark job with spark-submit using --master yarn and --deploy-mode cluster, it does not print or return any applicationId, and after the job completes I have to manually check the MapReduce JobHistory or the Spark History Server to find my job.
My cluster is used by many users, and it takes a long time to find my job in the JobHistory / History Server.

Is there any way to configure spark-submit to return the applicationId?

Note: I found many similar questions, but their solutions extract the applicationId in the driver code using sparkContext.applicationId. With --master yarn and --deploy-mode cluster, however, the driver itself runs inside the cluster as part of the YARN application, so anything it logs or prints to stdout ends up in a remote host's logs.
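One workaround that does not require touching the driver at all: in yarn-cluster mode the spark-submit client itself logs "Application report for application_..." lines while it polls YARN, so the submitting machine can capture that output and pull the ID out with a regex. A minimal sketch, assuming that log format (the `submit_and_get_app_id` helper and the exact command line are illustrative, not part of any Spark API):

```python
import re
import subprocess

# YARN application IDs look like application_<clusterTimestamp>_<sequence>,
# e.g. application_1502789777295_0123
APP_ID_RE = re.compile(r"application_\d+_\d+")

def submit_and_get_app_id(spark_submit_cmd):
    """Run the given spark-submit command line and return the first
    YARN applicationId found in its combined stdout/stderr, or None
    if no ID appears (e.g. the submission failed before reaching YARN)."""
    proc = subprocess.run(
        spark_submit_cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the YARN client logs to stderr
        text=True,
    )
    match = APP_ID_RE.search(proc.stdout)
    return match.group(0) if match else None
```

Usage would be something like `submit_and_get_app_id(["spark-submit", "--master", "yarn", "--deploy-mode", "cluster", "app.jar"])`; note this depends on the client's log level being high enough for the application report lines to appear.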

1 answer

Here are the approaches I used to do this:

  • Save the applicationId to a file on HDFS from the driver (suggested by @zhangtong in a comment).
  • Send an email notification containing the applicationId from the driver.
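The first approach can be sketched as a small driver-side helper. This is a sketch, not an official API: `save_application_id` and `write_text` are hypothetical names, and the actual HDFS write (e.g. through a Hadoop `FileSystem` call or by shelling out to `hdfs dfs -put`) is abstracted behind `write_text` so the logic stays testable without a cluster. Only `spark_context.applicationId` is a real Spark attribute:

```python
def save_application_id(spark_context, path, write_text):
    """Persist the running job's applicationId so it can be located
    after submission without digging through the History Server.

    spark_context -- a SparkContext; its .applicationId attribute
                     holds the YARN application ID in yarn mode
    path          -- destination path, e.g. an HDFS path agreed on
                     by the submitting user
    write_text    -- callable (path, text) that performs the write;
                     in a real driver this would write to HDFS
    """
    app_id = spark_context.applicationId
    write_text(path, app_id + "\n")
    return app_id
```

The submitting side then only needs to read that known path back to discover the ID; the email-notification variant is the same idea with `write_text` replaced by an SMTP send.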

Source: https://habr.com/ru/post/1622707/

