How to specify which version of java to use in spark-submit command?

I want to run a Spark application on a YARN cluster on a remote server. The default Java version there is 1.7, but I want to use 1.8 for my application; it is also installed on the server, just not as the default. Is there a way to point spark-submit at the Java 1.8 location so that I don't get the major.minor version error?

+5
3 answers

In our case JAVA_HOME alone was not enough: the driver ran on Java 8, but I later found that the Spark workers in YARN were launched with Java 7 (the Java version the Hadoop installation on the nodes was using).

I had to add spark.executorEnv.JAVA_HOME=/usr/java/<version available in workers> to spark-defaults.conf. Note that you can also provide it on the command line with --conf.

See http://spark.apache.org/docs/latest/configuration.html#runtime-environment
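A minimal sketch of both options, assuming JDK 8 lives under /usr/java/jdk1.8.0 on the worker nodes (the path and the application jar name are illustrative; spark.yarn.appMasterEnv.JAVA_HOME additionally covers the application master in cluster mode):

```shell
# Option 1: in $SPARK_HOME/conf/spark-defaults.conf
#   spark.executorEnv.JAVA_HOME        /usr/java/jdk1.8.0
#   spark.yarn.appMasterEnv.JAVA_HOME  /usr/java/jdk1.8.0

# Option 2: per job, on the spark-submit command line
spark-submit \
  --master yarn \
  --conf "spark.executorEnv.JAVA_HOME=/usr/java/jdk1.8.0" \
  --conf "spark.yarn.appMasterEnv.JAVA_HOME=/usr/java/jdk1.8.0" \
  my_app.jar
```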

+9

Although you can make the driver code run on a specific Java version (export JAVA_HOME=/path/to/jre/ && spark-submit ...), the workers will execute the code with the default Java version from the PATH of the yarn user on the worker machines.

What you can do is configure each Spark instance to use a specific JAVA_HOME by editing its spark-env.sh file (see the Spark configuration documentation).
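A minimal spark-env.sh sketch (the JDK path is an assumption; adjust it to wherever Java 8 lives on each node):

```shell
# $SPARK_HOME/conf/spark-env.sh
# Point every Spark process launched from this machine at Java 8.
export JAVA_HOME=/usr/java/jdk1.8.0
export PATH="$JAVA_HOME/bin:$PATH"
```

Note that this file must be present (and consistent) on every node, since each Spark instance reads its own local copy.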

+2

Add the JAVA_HOME you want to spark-env.sh. To locate the file, run sudo find / -name spark-env.sh; e.g. /etc/spark2/conf.cloudera.spark2_on_yarn/spark-env.sh.

0

Source: https://habr.com/ru/post/1247935/

