Spark 2.0 standalone mode: dynamic resource allocation worker launch error

I run Spark 2.0 in standalone mode, successfully configured it to launch on a server, and also set up the IPython PySpark kernel as an option in Jupyter Notebook. Everything works fine, but I ran into the problem that for every notebook I launch, all 4 of my workers are assigned to that application. So if another person from my team tries to start another notebook with the PySpark kernel, it simply does not work until I stop the first notebook and release all the workers.

To solve this problem, I am trying to follow the dynamic resource allocation instructions from the Spark 2.0 documentation. So, in $SPARK_HOME/conf/spark-defaults.conf I have the following lines:

spark.dynamicAllocation.enabled    true
spark.shuffle.service.enabled      true
spark.dynamicAllocation.executorIdleTimeout    10

In addition, in $SPARK_HOME/conf/spark-env.sh I have:

export SPARK_WORKER_MEMORY=1g
export SPARK_EXECUTOR_MEMORY=512m
export SPARK_WORKER_INSTANCES=4
export SPARK_WORKER_CORES=1

However, when I launch the workers using $SPARK_HOME/sbin/start-slaves.sh, only the first worker starts successfully. Its log ends with:

16/11/24 13:32:06 INFO Worker: Successfully registered with master spark://Cerberus:7077

But the logs from workers 2-4 show this error:

INFO ExternalShuffleService: Starting shuffle service on port 7337 with useSasl = false
16/11/24 13:32:08 ERROR: java.net.BindException: Address already in use

It seems to me that since the first worker successfully starts the shuffle service on port 7337, workers 2-4 do not know about it and try to start another shuffle service on the same port, failing with "Address already in use".
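As a sanity check of this reasoning (my own sketch, not from the original post), plain Python reproduces the same BindException: the second attempt to bind a listener to an already-occupied port fails exactly the way workers 2-4 do. This assumes nothing else is listening on 7337 (Spark's default shuffle-service port) when you run it:

import socket

# First listener grabs the shuffle-service port, like worker 1 does.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("0.0.0.0", 7337))
first.listen(1)

# A second bind to the same port fails, which is what workers 2-4 hit
# when each tries to start its own external shuffle service.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("0.0.0.0", 7337))
except OSError as exc:
    print("second bind failed:", exc)  # e.g. [Errno 98] Address already in use
finally:
    second.close()
    first.close()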

The error occurs for all workers (1-4) even if I first start the shuffle service (using $SPARK_HOME/sbin/start-shuffle-service.sh) and only then launch the workers ($SPARK_HOME/sbin/start-slaves.sh).

Is there a way around this? Can the workers check that a shuffle service is already running and connect to it, instead of each trying to create its own?

1 answer:

I ran into the same issue and got it working by removing the spark.shuffle.service.enabled entry from the config file (in fact, I don't keep any of the dynamicAllocation settings there) and instead setting it on the SparkConf that I pass to the SparkContext:

import pyspark

# Enable dynamic allocation per application rather than in spark-defaults.conf
sconf = pyspark.SparkConf() \
    .setAppName("sc1") \
    .set("spark.dynamicAllocation.enabled", "true") \
    .set("spark.shuffle.service.enabled", "true")
sc1 = pyspark.SparkContext(conf=sconf)
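
As a quick sanity check (my addition, not part of the original answer), the settings can be read back from the live context to confirm they were picked up:

# Both should print 'true' if the configuration was applied.
print(sc1.getConf().get("spark.dynamicAllocation.enabled"))
print(sc1.getConf().get("spark.shuffle.service.enabled"))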

I launch the master and the workers as usual:

$SPARK_HOME/sbin/start-all.sh

And I start a single instance of the shuffle service:

$SPARK_HOME/sbin/start-shuffle-service.sh

Then I started two notebooks with this context and ran a small job in each (see the sketch below). The first application goes to the RUNNING state while the second sits in WAITING. After about a minute (the default executor idle timeout), the resources are reassigned and the second application runs too, so both end up RUNNING.
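
A minimal sketch of the kind of small job each notebook could run to watch this happen (the RDD and numbers here are my own illustration, not from the original answer):

# In each notebook, after creating sc1 as shown above:
rdd = sc1.parallelize(range(1000000))
print(rdd.map(lambda x: x * x).sum())
# Once the job finishes and the executors stay idle longer than
# spark.dynamicAllocation.executorIdleTimeout, they are released and
# become available to the other notebook's application.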

Hope this helps.


Source: https://habr.com/ru/post/1661857/

