Spark: running jobs with different memory / core configurations from a single JVM at the same time

Explanation of the problem

Suppose you have a Spark cluster with the standalone cluster manager, where jobs are scheduled through a SparkSession created in the client application. The client application runs on a JVM, and for performance reasons each job has to be launched with a different configuration; see the examples of job types below.

The problem is that you cannot create two such sessions from the same JVM: a SparkSession is tied to the single SparkContext living in that JVM, and the executor settings are fixed when that context is created.

So, how are you going to run several Spark jobs with different session configurations at the same time?

By different session configurations I mean the following (a minimal sketch of the limitation follows this list):

  • spark.executor.cores
  • spark.executor.memory
  • spark.kryoserializer.buffer.max
  • spark.scheduler.pool
  • etc.
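
To make the limitation concrete, here is a minimal sketch (the standalone master URL, app name and object name are made up for illustration). The executor settings are fixed when the first SparkContext is created in the JVM; a second SparkSession.builder().getOrCreate() in the same JVM just reuses that context, so the new executor settings are not applied:

```scala
import org.apache.spark.sql.SparkSession

object SingleJvmLimitation {
  def main(args: Array[String]): Unit = {
    // First session: executor memory/cores are fixed when the underlying
    // SparkContext is created in this JVM.
    val heavy = SparkSession.builder()
      .master("spark://master:7077")          // hypothetical standalone master URL
      .appName("heavy-job")
      .config("spark.executor.cores", "4")
      .config("spark.executor.memory", "8g")
      .getOrCreate()

    // Second attempt in the same JVM: getOrCreate() returns a session backed
    // by the SAME SparkContext; Spark only logs a warning and the new
    // executor settings do not take effect on the running context.
    val light = SparkSession.builder()
      .config("spark.executor.cores", "1")
      .config("spark.executor.memory", "1g")
      .getOrCreate()

    // Both sessions report the configuration of the first context.
    println(heavy.sparkContext.getConf.get("spark.executor.memory"))  // 8g
    println(light.sparkContext.getConf.get("spark.executor.memory"))  // 8g

    heavy.stop()
  }
}
```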

My thoughts

Possible solutions to the problem:

  • Set the settings per job inside the single SparkSession of the client application. Is that even possible?
  • Launch a separate JVM with its own SparkSession for each kind of Spark job (see the sketch further below). The downside: 2-3 extra driver JVMs have to be kept around, one per configuration, which is not very flexible.
  • Use a separate scheduler pool per job type. But pools only divide the resources of one context between jobs; the executor memory and cores stay the same.
  • Move the data out of Spark into an external in-memory store (for example Hazelcast) and let a separate Spark application with its own configuration pick it up. The downside: extra moving parts such as serialization/deserialization, and so on.

  • Some job types are IO-bound while others are not, so they benefit from different executor settings.
  • At any given moment only 1-2 of the 3 job types run at the same time, but that may change.
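
If the "separate JVM per configuration" option is chosen, one way to sketch it without hand-rolled process management is Spark's SparkLauncher API, which starts each driver as its own process. This is only an illustration under assumed paths and names (Spark home, master URL, job jar, main class and pool name are all hypothetical):

```scala
import org.apache.spark.launcher.SparkLauncher

// Sketch of the "separate JVM per configuration" idea: the client JVM only
// submits the job; the driver with its own SparkConf runs as a new process.
object SubmitWithOwnConfig {
  def main(args: Array[String]): Unit = {
    val handle = new SparkLauncher()
      .setSparkHome("/opt/spark")             // assumption: Spark installed locally
      .setMaster("spark://master:7077")       // hypothetical standalone master URL
      .setAppResource("/jobs/heavy-job.jar")  // hypothetical job artifact
      .setMainClass("com.example.HeavyJob")   // hypothetical entry point
      .setConf("spark.executor.memory", "8g")
      .setConf("spark.executor.cores", "4")
      .setConf("spark.scheduler.pool", "heavy")
      .startApplication()                     // non-blocking, returns a SparkAppHandle

    // The client keeps running; poll the handle until the job reaches a final state.
    while (!handle.getState.isFinal) Thread.sleep(1000)
    println(s"Job finished with state ${handle.getState}")
  }
}
```

Each submission gets its own driver JVM, so its executor memory, cores and scheduler pool are completely independent of the client application's own SparkContext.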

Spark standalone mode uses a simple FIFO scheduler across applications. By default an application tries to acquire all cores in the cluster, so to run several applications side by side you have to cap what each one may consume (spark.cores.max, spark.executor.memory and so on) in its SparkConf.
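
For example, if every job type is submitted as its own standalone application (each from its own driver process), the per-application cap could look like the following sketch; the master URL, object name and numbers are illustrative:

```scala
import org.apache.spark.sql.SparkSession

object IoBoundJob {
  def main(args: Array[String]): Unit = {
    // Cap what this one application may take from the standalone cluster,
    // so applications with other configurations can run alongside it.
    val spark = SparkSession.builder()
      .master("spark://master:7077")          // hypothetical standalone master URL
      .appName("io-bound-job")
      .config("spark.cores.max", "8")         // total cores this application may use
      .config("spark.executor.cores", "2")
      .config("spark.executor.memory", "4g")
      .getOrCreate()

    // ... the actual job goes here ...
    spark.stop()
  }
}
```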

Apache Mesos uses two-level scheduling. Every framework (for example a Spark application running on Apache Mesos) registers with the Mesos master; the master makes resource offers to the registered frameworks, and each framework decides for itself which offers to accept and which to decline. That is how Apache Mesos shares one cluster between frameworks with different resource requirements. For Spark on Apache Mesos this means each application is submitted as its own framework with its own configuration, and in coarse-grained mode you again cap the cores it may take with settings such as spark.cores.max.
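
A comparable sketch for Spark on Mesos: coarse-grained mode plus a spark.cores.max cap keeps one application from taking every offer, so applications with other configurations can run next to it (the ZooKeeper-based master URL, object name and numbers are made up):

```scala
import org.apache.spark.sql.SparkSession

object JobOnMesos {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("mesos://zk://zk1:2181,zk2:2181/mesos") // hypothetical Mesos master via ZooKeeper
      .appName("job-on-mesos")
      .config("spark.mesos.coarse", "true")   // coarse-grained mode: long-lived executors
      .config("spark.cores.max", "8")         // do not accept more cores than this
      .config("spark.executor.memory", "4g")
      .getOrCreate()

    // ... the actual job goes here ...
    spark.stop()
  }
}
```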

In Apache Hadoop YARN the ResourceManager has two main components: the Scheduler and the ApplicationsManager. The Scheduler is pluggable; the common implementations are the CapacityScheduler, which lets several tenants share the cluster through queues with capacity guarantees, and the FairScheduler, which tries to give all running applications an equal share of resources over time. The Scheduler only allocates resources and does not monitor or restart applications. The ApplicationsManager accepts job submissions and negotiates the first container, in which the application-specific ApplicationMaster is started; for Spark the ApplicationMaster is part of the Spark application itself. The per-application executor settings are still taken from the SparkConf at submit time.
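
And the YARN counterpart: every submission becomes its own YARN application, the queue determines how the CapacityScheduler or FairScheduler treats it, and the executor sizing still comes from the SparkConf. The queue name, object name and sizes below are made up, and a Hadoop/YARN client configuration on the classpath is assumed:

```scala
import org.apache.spark.sql.SparkSession

object JobOnYarn {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("yarn")                           // needs HADOOP_CONF_DIR / YARN_CONF_DIR set
      .appName("job-on-yarn")
      .config("spark.yarn.queue", "heavy")      // hypothetical CapacityScheduler/FairScheduler queue
      .config("spark.executor.memory", "8g")
      .config("spark.executor.cores", "4")
      .config("spark.executor.instances", "10")
      .getOrCreate()

    // ... the actual job goes here ...
    spark.stop()
  }
}
```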

So, whichever cluster manager you use, the practical way to get different memory/core settings is to submit each job type as a separate application with its own SparkConf and let the manager's scheduler share the cluster between them.


Source: https://habr.com/ru/post/1015838/

