To get this working you need to change a couple of configurations. First, in your YARN configuration, change or add this line (note that yarn-default.xml only ships Hadoop's defaults; custom values normally go into yarn-site.xml):
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds=3600
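If you keep your overrides in yarn-site.xml, which uses XML rather than key=value format, the equivalent entry would look like this sketch:

```xml
<property>
  <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
  <value>3600</value>
</property>
```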
This setting makes YARN roll up (aggregate) the log files periodically while the application is still running, so you can inspect them with yarn logs -applicationId YOUR_APP_ID without waiting for the application to finish. That is the first step.
Second, you need to create a log4j-driver.properties file and a log4j-executor.properties file. You can use this example for both:
log4j.rootLogger=INFO, rolling
log4j.appender.rolling=org.apache.log4j.RollingFileAppender
log4j.appender.rolling.layout=org.apache.log4j.PatternLayout
log4j.appender.rolling.layout.conversionPattern=[%d] %p %m (%c)%n
log4j.appender.rolling.maxFileSize=50MB
log4j.appender.rolling.maxBackupIndex=5
log4j.appender.rolling.file=/var/log/spark/${dm.logging.name}.log
log4j.appender.rolling.encoding=UTF-8
log4j.logger.org.apache.spark=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.com.anjuke.dm=${dm.logging.level}
What do these lines mean?
The line log4j.appender.rolling.maxFileSize=50MB caps each log file at 50 MB: when a file reaches 50 MB, it is closed and a new one is started.
Another important line is log4j.appender.rolling.maxBackupIndex=5: it keeps a backup history of at most 5 files of 50 MB each. The oldest backup is deleted whenever a new file is rolled, so a single log never takes more than roughly 300 MB on disk (the active file plus five backups).
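The deletion behavior can be sketched as a small shell simulation. This mimics RollingFileAppender's backup numbering; the directory and file names are hypothetical, and this is an analogy, not Spark code:

```shell
# Simulate maxBackupIndex=5: keep the active file plus at most
# 5 numbered backups, dropping the oldest on each roll.
dir=$(mktemp -d)
max=5
roll() {
  rm -f "$dir/myapp.log.$max"            # oldest backup is discarded
  for i in 4 3 2 1; do                   # shift remaining backups up by one
    [ -f "$dir/myapp.log.$i" ] && mv "$dir/myapp.log.$i" "$dir/myapp.log.$((i + 1))"
  done
  [ -f "$dir/myapp.log" ] && mv "$dir/myapp.log" "$dir/myapp.log.1"
  : > "$dir/myapp.log"                   # start a fresh active file
}
for n in 1 2 3 4 5 6 7 8; do
  echo "roll $n" > "$dir/myapp.log"      # pretend the file filled up
  roll
done
ls "$dir"                                # myapp.log plus myapp.log.1 .. myapp.log.5
```

After eight rolls only six files remain: the active file and the five newest backups, exactly as log4j would keep them.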
After creating these property files, you need to reference them in your spark-submit command:
spark-submit --master spark://127.0.0.1:7077 \
  --driver-java-options "-Dlog4j.configuration=file:/path/to/log4j-driver.properties -Ddm.logging.level=DEBUG" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j-executor.properties -Ddm.logging.name=myapp -Ddm.logging.level=DEBUG" \
  ...
This gives you one log file for your driver and one for your executors. Personally I use two different files, one for the driver and one for the executors, but you can also point both at a single properties file.
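One caveat when running on YARN: the executor JVMs do not automatically see a file that only exists on the submitting machine's local disk. A common approach, sketched below under the assumption of YARN deployment (the application jar name is a placeholder), is to ship the executor properties file with --files and reference it by its bare name, since it lands in each container's working directory:

```shell
spark-submit --master yarn \
  --files /path/to/log4j-executor.properties \
  --driver-java-options "-Dlog4j.configuration=file:/path/to/log4j-driver.properties -Ddm.logging.level=DEBUG" \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j-executor.properties -Ddm.logging.name=myapp -Ddm.logging.level=DEBUG" \
  your-app.jar
```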