If you are not using spark-submit, the cleanest option is to override SPARK_CONF_DIR. Create a separate directory for each set of configurations:
$ cd configs
$ tree
.
├── conf1
│   ├── docker.properties
│   ├── fairscheduler.xml
│   ├── log4j.properties
│   ├── metrics.properties
│   ├── spark-defaults.conf
│   ├── spark-defaults.conf.template
│   └── spark-env.sh
└── conf2
    ├── docker.properties
    ├── fairscheduler.xml
    ├── log4j.properties
    ├── metrics.properties
    ├── spark-defaults.conf
    ├── spark-defaults.conf.template
    └── spark-env.sh
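Each directory can then hold different settings. For illustration only (the property values below are assumptions, not taken from the question), conf1/spark-defaults.conf might contain:

spark.master              local[4]
spark.executor.memory     2g
spark.eventLog.enabled    true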
Then set the environment variable before initializing any JVM-dependent objects, i.e. before the first SparkSession (or SparkContext) is created:
import os
from pyspark.sql import SparkSession

# Must be set before the JVM is launched, i.e. before getOrCreate()
os.environ["SPARK_CONF_DIR"] = "/path/to/configs/conf1"
spark = SparkSession.builder.getOrCreate()
or
import os
from pyspark.sql import SparkSession
os.environ["SPARK_CONF_DIR"] = "/path/to/configs/conf2"
spark = SparkSession.builder.getOrCreate()
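As a quick sanity check (a sketch, assuming the illustrative spark-defaults.conf shown above), you can confirm which defaults were actually picked up:

import os
from pyspark.sql import SparkSession

os.environ["SPARK_CONF_DIR"] = "/path/to/configs/conf1"
spark = SparkSession.builder.getOrCreate()

# Should print the value defined in conf1/spark-defaults.conf, e.g. 2g
print(spark.sparkContext.getConf().get("spark.executor.memory"))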
This is a workaround and may not work in complex scenarios.