Spark yarn-cluster mode: how to get the applicationId from spark-submit

When I submit a Spark job with spark-submit using --master yarn and --deploy-mode cluster, it does not print or return any applicationId, and after the job completes I have to manually check the MapReduce JobHistory or the Spark History Server to find my job.
My cluster is used by many users, and it takes a long time to find my job in the JobHistory / History Server.

Is there any way to configure spark-submit to return the applicationId?

Note: I found many similar questions, but their solutions extract the applicationId in the driver code using sparkContext.applicationId. With --master yarn and --deploy-mode cluster, however, the driver itself runs inside the cluster as part of the YARN application, so anything it logs or prints to stdout ends up in a remote host's logs.
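One workaround that does not require touching the driver at all: in yarn-cluster mode the spark-submit client itself logs "Application report for application_..." lines while it polls YARN, so the submitting machine can capture that output and pull the ID out with a regex. A minimal sketch, assuming that log format (the `submit_and_get_app_id` helper and the exact command line are illustrative, not part of any Spark API):

```python
import re
import subprocess

# YARN application IDs look like application_<clusterTimestamp>_<sequence>,
# e.g. application_1502789777295_0123
APP_ID_RE = re.compile(r"application_\d+_\d+")

def submit_and_get_app_id(spark_submit_cmd):
    """Run the given spark-submit command line and return the first
    YARN applicationId found in its combined stdout/stderr, or None
    if no ID appears (e.g. the submission failed before reaching YARN)."""
    proc = subprocess.run(
        spark_submit_cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the YARN client logs to stderr
        text=True,
    )
    match = APP_ID_RE.search(proc.stdout)
    return match.group(0) if match else None
```

Usage would be something like `submit_and_get_app_id(["spark-submit", "--master", "yarn", "--deploy-mode", "cluster", "app.jar"])`; note this depends on the client's log level being high enough for the application report lines to appear.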

1 answer

Here are the approaches I used to do this:

  • Save the applicationId to a file on HDFS from the driver (suggested by @zhangtong in a comment).
  • Send an email notification containing the applicationId from the driver.
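The first approach can be sketched as a small driver-side helper. This is a sketch, not an official API: `save_application_id` and `write_text` are hypothetical names, and the actual HDFS write (e.g. through a Hadoop `FileSystem` call or by shelling out to `hdfs dfs -put`) is abstracted behind `write_text` so the logic stays testable without a cluster. Only `spark_context.applicationId` is a real Spark attribute:

```python
def save_application_id(spark_context, path, write_text):
    """Persist the running job's applicationId so it can be located
    after submission without digging through the History Server.

    spark_context -- a SparkContext; its .applicationId attribute
                     holds the YARN application ID in yarn mode
    path          -- destination path, e.g. an HDFS path agreed on
                     by the submitting user
    write_text    -- callable (path, text) that performs the write;
                     in a real driver this would write to HDFS
    """
    app_id = spark_context.applicationId
    write_text(path, app_id + "\n")
    return app_id
```

The submitting side then only needs to read that known path back to discover the ID; the email-notification variant is the same idea with `write_text` replaced by an SMTP send.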

Source: https://habr.com/ru/post/1622707/

