SparkConf settings not used when starting a Spark application in cluster mode on YARN

I wrote a Spark application that sets a number of configuration options through a SparkConf instance, for example:

SparkConf conf = new SparkConf().setAppName("Test App Name");

conf.set("spark.driver.cores", "1");
conf.set("spark.driver.memory", "1800m");

conf.set("spark.yarn.am.cores", "1");
conf.set("spark.yarn.am.memory", "1800m");

conf.set("spark.executor.instances", "30");
conf.set("spark.executor.cores", "3");
conf.set("spark.executor.memory", "2048m");

JavaSparkContext sc = new JavaSparkContext(conf);

JavaRDD<String> inputRDD = sc.textFile(...);
...

When I run this application with the command below (master=yarn and deploy-mode=client)

spark-submit --class spark.MyApp --master yarn --deploy-mode client /home/myuser/application.jar

everything works fine and the Spark History Server UI displays the correct executor information: [screenshot: Spark History UI executor summary]

But when I start it with master=yarn and deploy-mode=cluster,

my Spark UI shows incorrect executor information (~512 MB instead of ~1400 MB): [screenshot: Spark UI executor summary with wrong memory]

In addition, the application name is spark.MyApp instead of Test App Name, so apparently my SparkConf is ignored entirely and only defaults are used. What am I missing? Why are my settings not applied in cluster mode?

I am using Spark 1.6.2 on HDP 2.5 with YARN.


Yes, this is confusing, but there is a logic to it: it is simply how Spark handles configuration on YARN!


In general, Spark properties can be set programmatically, through a SparkConf, before the SparkContext is created. See http://spark.apache.org/docs/1.6.2/configuration.html

The CPU/RAM settings in question (among others):

  • spark.executor.cores
  • spark.executor.memory
  • spark.driver.cores
  • spark.driver.memory

The catch: when Spark runs on top of Hadoop, i.e. on YARN, the rules change:

  • first, read the "Running on YARN" page carefully: http://spark.apache.org/docs/1.6.2/running-on-yarn.html (beware: several of these properties interact with the Hadoop/YARN configuration, and the exact behavior depends on your YARN version!)

  • these properties should not be set in SparkConf from application code! The reliable way is to pass them to spark-submit (see the example after this list):

    • --executor-cores 5
    • --executor-memory 5g
    • --driver-cores 3
    • --driver-memory 3g
  • in yarn-client mode the driver is not a YARN container, so spark.driver.cores and spark.driver.memory set in code have no effect! What you can size from SparkConf there is the YARN Application Master (AM):

    • spark.yarn.am.cores
    • spark.yarn.am.memory
    • there are no dedicated spark-submit flags for the AM!
  • for the executors, on the other hand, both ways work:
    • spark.executor.cores and spark.executor.memory in SparkConf
    • --executor-cores and --executor-memory on spark-submit
    • and if both are set, the value in SparkConf overrides the flag passed to spark-submit!
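For instance, for the application from the question, the cluster-mode submission would become something like this (a sketch, with the resource values copied from the question's SparkConf; --name is presumably needed too, since setAppName() in code is equally too late in cluster mode, which would explain the spark.MyApp you saw):

spark-submit --class spark.MyApp --master yarn --deploy-mode cluster \
    --name "Test App Name" \
    --driver-cores 1 --driver-memory 1800m \
    --num-executors 30 --executor-cores 3 --executor-memory 2048m \
    /home/myuser/application.jar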

In short:

[screenshot: summary table of which settings apply in which mode]

Yes, it is messy, but that is how Spark on YARN works...


Same problem here. Here is what I found:

I ran into the same behavior, in my case with PySpark 2.0.0 on YARN.

The bottom line: every time I tried to set the driver memory from inside the script (through the SparkSession builder, i.e. SparkConf) in cluster mode, the job failed.

The job would fail, and the YARN ApplicationMaster log reported the container being killed with exit code 143.


CASE 1: driver memory and executor memory both set inside the script (in the SparkSession config)

from pyspark.sql import SparkSession

# Both driver and executor memory are set from inside the script
spark = (SparkSession
    .builder
    .appName("driver_executor_inside")
    .enableHiveSupport()
    .config("spark.executor.memory","4g")
    .config("spark.executor.cores","2")
    .config("spark.yarn.executor.memoryOverhead","1024")
    .config("spark.driver.memory","2g")
    .getOrCreate())

spark-submit --master yarn --deploy-mode cluster myscript.py

Result: the job finished with status FAILED, with the error message described above. [screenshot: Spark UI executor summary]


CASE 2: driver memory passed on the spark-submit command line, executor settings in SparkConf inside the script

from pyspark.sql import SparkSession

# Only the executor settings are set from inside the script;
# the driver memory comes from the spark-submit command line below
spark = (SparkSession
    .builder
    .appName("executor_inside")
    .enableHiveSupport()
    .config("spark.executor.memory","4g")
    .config("spark.executor.cores","2")
    .config("spark.yarn.executor.memoryOverhead","1024")
    .getOrCreate())

spark-submit --master yarn --deploy-mode cluster --conf spark.driver.memory=2g myscript.py

Result: the job completed successfully, and the memory settings were applied as configured.

[screenshot: Spark UI executor summary]


CASE 3: driver memory on the spark-submit command line, executor memory not set at all

from pyspark.sql import SparkSession

# No executor memory at all; the driver memory comes from spark-submit
spark = (SparkSession
    .builder
    .appName("executor_not_written")
    .enableHiveSupport()
    .config("spark.executor.cores","2")
    .config("spark.yarn.executor.memoryOverhead","1024")
    .getOrCreate())

spark-submit --master yarn --deploy-mode cluster --conf spark.driver.memory=2g myscript.py

[screenshot: Spark UI executor summary]

In other words: in YARN cluster mode, spark.driver.memory must be passed on the spark-submit command line, not set in the script. Judging by CASE 2, though, the executor settings can still come from SparkConf.
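Translated back to the Java application from the original question, that split would look roughly like this (a sketch under the assumptions above: executor settings stay in code, driver resources and the application name move to spark-submit; the values are the ones from the question):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Executors are requested after the driver is already running,
// so these settings can stay in code (cf. CASE 2 above)
SparkConf conf = new SparkConf();
conf.set("spark.executor.instances", "30");
conf.set("spark.executor.cores", "3");
conf.set("spark.executor.memory", "2048m");

JavaSparkContext sc = new JavaSparkContext(conf);

with the driver resources (and the app name) given on the command line:

spark-submit --class spark.MyApp --master yarn --deploy-mode cluster \
    --name "Test App Name" \
    --driver-cores 1 --driver-memory 1800m \
    /home/myuser/application.jar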


Source: https://habr.com/ru/post/1676377/

