Apache Spark Stderr and Stdout

I am running Spark 1.0.0, connected to a standalone Spark cluster with one master and two slaves. I run wordcount.py with spark-submit; it reads its data from HDFS and writes the results back to HDFS. So far everything works and the results land in HDFS correctly. What bothers me is that when I check the stdout of each worker, it is empty. Should it be empty?
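
For context, wordcount.py does roughly the following (a sketch; the HDFS paths are placeholders):

 from pyspark import SparkContext

 # Minimal word count: read from HDFS, write the counts back to HDFS.
 sc = SparkContext(appName="PythonWordCount")
 counts = (sc.textFile("hdfs://master:9000/user/input.txt")
             .flatMap(lambda line: line.split())
             .map(lambda w: (w, 1))
             .reduceByKey(lambda a, b: a + b))
 counts.saveAsTextFile("hdfs://master:9000/user/output")
 sc.stop()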

Meanwhile, I got the following in stderr (log page for app-20140704174955-0002):

 Spark Executor Command: "java" "-cp" "::/usr/local/spark-1.0.0/conf:/usr/local/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop1.2.1.jar:/usr/local/hadoop/conf" "-XX:MaxPermSize=128m" "-Xms512M" "-Xmx512M" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "akka.tcp://spark@master:54477/user/CoarseGrainedScheduler" "0" "slave2" "1" "akka.tcp://sparkWorker@slave2:41483/user/Worker" "app-20140704174955-0002"
 ========================================
 14/07/04 17:50:14 ERROR CoarseGrainedExecutorBackend: Driver Disassociated [akka.tcp://sparkExecutor@slave2:33758] -> [akka.tcp://spark@master:54477] disassociated! Shutting down.
+6
2 answers

Spark always writes everything, even INFO-level messages, to stderr. People seem to do this to stop stdout buffering the messages and making the logging less predictable. It is accepted practice when it is known that stdout will never be consumed by bash scripts, which is why stderr is so commonly used for logging. So yes, an empty stdout is expected: nothing appears there unless your own code explicitly prints to it.
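
A quick way to convince yourself, as a minimal sketch (the function name and messages are my own invention, not from the job above): anything a task prints itself lands in that worker's stdout file, while Spark's own log lines keep going to stderr.

 import sys

 def tokenize(line):
     # Executed on a worker: print() goes to that worker's stdout file.
     print("tokenizing: " + line)
     # Your own diagnostics are better sent to stderr, matching
     # Spark's convention of keeping stdout free for program output.
     sys.stderr.write("saw one line\n")
     return line.split()

Use it in place of the plain lambda, e.g. .flatMap(tokenize), and the worker stdout pages will no longer be empty.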

+8

Try this in the log4j.properties you pass to Spark (or change the default configuration in Spark's conf/ directory):

 # Log to stdout and stderr
 log4j.rootLogger=INFO, stdout, stderr

 # Send TRACE - INFO level to stdout
 log4j.appender.stdout=org.apache.log4j.ConsoleAppender
 log4j.appender.stdout.Threshold=TRACE
 log4j.appender.stdout.Target=System.out
 log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
 log4j.appender.stdout.filter.filter1=org.apache.log4j.varia.LevelRangeFilter
 log4j.appender.stdout.filter.filter1.levelMin=TRACE
 log4j.appender.stdout.filter.filter1.levelMax=INFO
 log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

 # Send WARN or higher to stderr
 log4j.appender.stderr=org.apache.log4j.ConsoleAppender
 log4j.appender.stderr.Threshold=WARN
 log4j.appender.stderr.Target=System.err
 log4j.appender.stderr.layout=org.apache.log4j.PatternLayout
 log4j.appender.stderr.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n

 # Change this to set Spark log level
 log4j.logger.org.apache.spark=WARN
 log4j.logger.org.apache.spark.util=ERROR
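
To make Spark pick this file up, one common approach (a sketch; the path is a placeholder and the file must exist at that path on every node) is to set the extra Java options in conf/spark-defaults.conf:

 spark.driver.extraJavaOptions   -Dlog4j.configuration=file:/usr/local/spark-1.0.0/conf/log4j.properties
 spark.executor.extraJavaOptions -Dlog4j.configuration=file:/usr/local/spark-1.0.0/conf/log4j.properties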

In addition, the progress indicators that Spark shows at the INFO level are written to stderr.

Disable with

 spark.ui.showConsoleProgress=false 
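
The same property can also be set from code instead of a config file; a minimal PySpark sketch (assuming the property is read when the SparkContext starts, so it must be set before the context is created):

 from pyspark import SparkConf, SparkContext

 # Disable the console progress bar before creating the context.
 conf = SparkConf().set("spark.ui.showConsoleProgress", "false")
 sc = SparkContext(conf=conf)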
+6

Source: https://habr.com/ru/post/971758/
