Spark Standalone Cluster Authentication

I have a standalone Spark cluster running on a remote server, and I'm new to Spark. Apparently, by default, there is no authentication scheme protecting the cluster port (7077): anyone can simply submit their own code to the cluster without any restrictions.

The Spark documentation says that authentication is possible in standalone mode via the spark.authenticate.secret parameter, but it doesn't really explain how exactly this should be used.

Is it possible to use some kind of shared secret that would prevent a potential attacker from submitting jobs to the cluster? Can someone explain how exactly this should be configured?

1 answer

There are two parts to authentication support:

  • setting the secret on the master and all the slaves
  • using the same secret when submitting jobs to the cluster

master and slaves

On each server in your cluster, add the following configuration to conf/spark-defaults.conf:

spark.authenticate.secret      SomeSecretKey
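
Note that, depending on your Spark version, the secret alone may not be enough: authentication itself is toggled by the separate spark.authenticate property, which defaults to false. Assuming that applies to your setup, a fuller conf/spark-defaults.conf would be:

spark.authenticate             true
spark.authenticate.secret      SomeSecretKey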

job submission

When you initialize the SparkContext, you must set the same secret in its configuration:

import org.apache.spark.{SparkConf, SparkContext}

// the secret here must match the one in conf/spark-defaults.conf on the cluster
val conf = new SparkConf()
  .set("spark.authenticate.secret", "SomeSecretKey")
val sc = new SparkContext(conf)
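
If you submit your application with spark-submit instead of constructing the context yourself, the same settings can be passed on the command line. In this sketch, spark://master-host:7077 and your-app.jar are placeholders for your own master URL and application jar:

spark-submit \
    --master spark://master-host:7077 \
    --conf spark.authenticate=true \
    --conf spark.authenticate.secret=SomeSecretKey \
    your-app.jar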

If you use SparkSession instead of a raw SparkContext:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.authenticate.secret", "SomeSecretKey")
  .getOrCreate()
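
As a quick sanity check, here is a minimal sketch of a throwaway app; the master URL and app name are made up for illustration. Run it once with the matching secret and once with a wrong one, and the second run should be rejected by the cluster:

import org.apache.spark.sql.SparkSession

object AuthSmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("spark://master-host:7077")   // placeholder master URL
      .appName("auth-smoke-test")           // placeholder app name
      .config("spark.authenticate", "true")
      .config("spark.authenticate.secret", "SomeSecretKey")
      .getOrCreate()

    // a trivial job: with matching secrets this prints 5050.0;
    // with a wrong or missing secret the connection is refused
    println(spark.sparkContext.parallelize(1 to 100).sum())
    spark.stop()
  }
}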

Source: https://habr.com/ru/post/1691119/

