I have a Spark cluster deployed. I submit Spark jobs with the spark-submit command from my Flask project.
I have several Spark confs in my project. The conf is chosen based on which class I run, but every time I run a Spark job I get this warning:
17/01/09 07:32:51 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect.
Question: Does this mean that a SparkContext already exists and my job picks it up?
Question: Why do the configurations not take effect?
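To check whether I am reading the warning correctly, here is my current understanding as a minimal standalone sketch (the app names and values here are made up, not from my project):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

object WarningDemo {
  def main(args: Array[String]): Unit = {
    // A SparkContext is already running in this JVM.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("first"))

    // getOrCreate() reuses that context and logs:
    //   WARN SparkContext: Using an existing SparkContext;
    //   some configuration may not take effect.
    // Core settings (master, app name, executor memory, ...) stay frozen;
    // runtime SQL options are still applied to the new session.
    val spark = SparkSession.builder()
      .master("local[4]")                           // ignored: context already exists
      .config("spark.sql.shuffle.partitions", "16") // applied: runtime SQL conf
      .getOrCreate()

    println(spark.sparkContext.master)                      // still local[2]
    println(spark.conf.get("spark.sql.shuffle.partitions")) // 16
    spark.stop()
  }
}

Is that what is happening in my job as well?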
The code:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

private val conf = new SparkConf()
  .setAppName("ELSSIE_Ingest_Cassandra")
  .setMaster(sparkIp)
  .set("spark.sql.shuffle.partitions", "8")
  .set("spark.cassandra.connection.host", cassandraIp)
  .set("spark.sql.crossJoin.enabled", "true")
object SparkJob extends Enumeration {
  val Program1, Program2, Program3, Program4, Program5 = Value
}
object ElssieCoreContext {
  def getSparkSession(sparkJob: SparkJob.Value = SparkJob.Program1): SparkSession = {
    val sparkSession = sparkJob match {
      case SparkJob.Program1 => {
        val updatedConf = conf
          .set("spark.cassandra.output.batch.size.bytes", "2048")
          .set("spark.sql.broadcastTimeout", "2000")
        SparkSession.builder().config(updatedConf).getOrCreate()
      }
      case SparkJob.Program2 => {
        val updatedConf = conf.set("spark.sql.broadcastTimeout", "2000")
        SparkSession.builder().config(updatedConf).getOrCreate()
      }
      // cases for Program3..Program5 omitted here for brevity
    }
    sparkSession
  }
}
And in Program1.scala I call:
val spark = ElssieCoreContext.getSparkSession()
val sc = spark.sparkContext
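For completeness, this is the direction I am considering as a workaround, in case the problem is the reused context (a sketch only, assuming Spark 2.2+ where SparkSession.getActiveSession is public; freshSession is a hypothetical helper, not part of my project):

import org.apache.spark.sql.SparkSession

// Hypothetical helper: stop any running session first, so that the next
// getSparkSession call builds a fresh SparkContext and its core conf
// (master, Cassandra host, ...) actually applies.
def freshSession(job: SparkJob.Value): SparkSession = {
  SparkSession.getActiveSession.foreach(_.stop())
  SparkSession.clearActiveSession()
  SparkSession.clearDefaultSession()
  ElssieCoreContext.getSparkSession(job)
}

// Runtime SQL options can also be changed on the existing session
// without recreating the context:
val spark = ElssieCoreContext.getSparkSession(SparkJob.Program2)
spark.conf.set("spark.sql.broadcastTimeout", "2000")

Is stopping the session between jobs the right approach, or is there a better way to apply a different conf per job?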