Separating application logs in Logback from Spark logs in log4j

I have a Scala Maven project using Spark, and I am trying to log with Logback. I compile my application into a jar and deploy it to an EC2 instance where the Spark distribution is installed. My pom.xml includes dependencies for Spark and Logback as follows:

        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.1.7</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>log4j-over-slf4j</artifactId>
            <version>1.7.7</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
            <exclusions>
                <exclusion>
                    <groupId>org.slf4j</groupId>
                    <artifactId>slf4j-log4j12</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>log4j</groupId>
                    <artifactId>log4j</artifactId>
                </exclusion>
            </exclusions>
        </dependency>

When I submit my Spark application, I print the slf4j binding that is actually in use. If I execute the jar directly with java, the binding goes to Logback. However, if I run it through Spark (i.e. spark-submit), the binding goes to log4j. This snippet in my application:

  import org.apache.spark.SparkContext
  import org.slf4j.{Logger, LoggerFactory}
  import org.slf4j.impl.StaticLoggerBinder

  val logger: Logger = LoggerFactory.getLogger(this.getClass)
  val sc: SparkContext = new SparkContext()
  val rdd = sc.textFile("myFile.txt")

  val slb: StaticLoggerBinder = StaticLoggerBinder.getSingleton
  System.out.println("Logger Instance: " + slb.getLoggerFactory)
  System.out.println("Logger Class Type: " + slb.getLoggerFactoryClassStr)

gives

Logger Instance: org.slf4j.impl.Log4jLoggerFactory@a64e035
Logger Class Type: org.slf4j.impl.Log4jLoggerFactory

I understand that both log4j-1.2.17.jar and slf4j-log4j12-1.7.16.jar sit in /usr/local/spark/jars, and that Spark most likely picks them up at runtime despite the exclusions in my pom.xml; if it did not, spark-submit would fail with a ClassNotFoundException.

My question: is there a way to keep my application's logging in Logback while leaving Spark's internal logging in log4j? Ideally, Spark's logs should still go to STDOUT.


I ran into the same problem and found a simpler setup. I use Logback through grizzled-slf4j, added in SBT:

"org.clapper" %% "grizzled-slf4j" % "1.3.0",

For Spark's own logging, I configure log4j the usual way, with a log4j.properties file under src/main/resources (Spark reads log4j.properties from the classpath). The two configurations then coexist.
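For reference, a minimal log4j.properties for Spark might look like the sketch below. It mirrors Spark's bundled log4j.properties.template; the pattern layout and the WARN level for Spark internals are illustrative choices, not part of the original answer:

    log4j.rootCategory=INFO, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.out
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

    # Optionally quiet down the noisier Spark internals:
    log4j.logger.org.apache.spark=WARN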


(A similar question, for sbt: fooobar.com/questions/1669208/...)

I struggled with the same problem: I wanted spark-submit to use my application's logging (Logback) instead of Spark's. Spark ships with log4j 1.2.xx on its classpath, which is why the log4j binding wins by default.

The solution I found is based on the Spark 1.6.1 docs (the wording is unchanged in the latest docs, Spark 2.2.0):

spark.driver.extraClassPath

Extra classpath entries to prepend to the classpath of the driver. Note: In client mode, this config must not be set through the SparkConf directly in your application, because the driver JVM has already started at that point. Instead, please set this through the --driver-class-path command line option.

spark.executor.extraClassPath

Extra classpath entries to prepend to the classpath of executors. This exists primarily for backwards-compatibility with older versions of Spark. Users typically should not need to set this option.

Notice the word prepend! Since the extraClassPath entries are prepended, they take precedence over the jars bundled with Spark!

So, here is what worked for me.

1. Get the following jars:

- log4j-over-slf4j-1.7.25.jar
- logback-classic-1.2.3.jar
- logback-core-1.2.3.jar

2. Run spark-submit with both extraClassPath settings:

libs="/absolute/path/to/libs/*"

spark-submit \
  ...
  --master yarn \
  --conf "spark.driver.extraClassPath=$libs" \
  --conf "spark.executor.extraClassPath=$libs" \
  ...
  /my/application/application-fat.jar \
  param1 param2

Note that in cluster mode the paths must be reachable from every node, for example by putting the jars on HDFS.

userClassPathFirst

For completeness: before that I had tried another option, also from the Spark 1.6.1 docs:

spark.driver.userClassPathFirst, spark.executor.userClassPathFirst

(Experimental) Whether to give user-added jars precedence over Spark's own jars when loading classes in the driver and executors. This feature can be used to mitigate conflicts between Spark's dependencies and user dependencies. It is currently an experimental feature. This is used in cluster mode only.

So I tried:

--conf "spark.driver.userClassPathFirst=true" \
--conf "spark.executor.userClassPathFirst=true" \

But this did not work for me, alone or in combination with extraClassPath. In the end I dropped userClassPathFirst in favor of the extraClassPath approach, which worked!


logback.xml

If your remaining problem is getting Spark to find your logback.xml at all, you can point Logback at the file explicitly via its standard logback.configurationFile system property.
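A hedged sketch of passing that property through spark-submit, using Spark's extraJavaOptions settings (the /absolute/path/to/logback.xml placeholder is an assumption; in cluster mode the executor path must exist on the worker nodes):

    spark-submit \
      ...
      --conf "spark.driver.extraJavaOptions=-Dlogback.configurationFile=/absolute/path/to/logback.xml" \
      --conf "spark.executor.extraJavaOptions=-Dlogback.configurationFile=/absolute/path/to/logback.xml" \
      ...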


Another solution that worked for me: library shading. The clash comes from the org.slf4j classes already on Spark's classpath. By shading (relocating) org.slf4j inside your fat jar, your application talks to its own copy of slf4j, which binds to Logback, with logback.xml bundled in the jar, while Spark keeps using its log4j setup untouched.

If you use sbt (with sbt-assembly), add the following shade rule to your build.sbt:

    assemblyShadeRules in assembly += ShadeRule.rename("org.slf4j.**" -> "your_favourite_prefix.@0").inAll
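Since the question uses Maven, the equivalent relocation with the maven-shade-plugin might look like the sketch below (the plugin version and the your_favourite_prefix prefix are illustrative, not from the original answer):

    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>3.2.4</version>
        <executions>
            <execution>
                <phase>package</phase>
                <goals>
                    <goal>shade</goal>
                </goals>
                <configuration>
                    <relocations>
                        <!-- relocate org.slf4j so the app uses its own copy -->
                        <relocation>
                            <pattern>org.slf4j</pattern>
                            <shadedPattern>your_favourite_prefix.org.slf4j</shadedPattern>
                        </relocation>
                    </relocations>
                </configuration>
            </execution>
        </executions>
    </plugin>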

Side note: if you are not sure whether the shading really happened, open your jar in an archive browser and check that the directory structure reflects the shaded one. In this case your jar should contain the path /your_favourite_prefix/org/slf4j, but not /org/slf4j.


Source: https://habr.com/ru/post/1669207/

