We are currently facing a problem where a number of our Spark job containers are being killed by YARN for exceeding memory limits:
```
16/11/18 17:58:52 WARN TaskSetManager: Lost task 53.0 in stage 49.0 (TID 32715, XXXXXXXXXX):
ExecutorLostFailure (executor 23 exited caused by one of the running tasks)
Reason: Container killed by YARN for exceeding memory limits. 12.4 GB of 12 GB physical memory used.
Consider boosting spark.yarn.executor.memoryOverhead.
```
The following arguments are passed through spark-submit:
```
--executor-memory=6G
--driver-memory=4G
--conf "spark.yarn.executor.memoryOverhead=6G"
```
I am using Spark 2.0.1.
We increased memoryOverhead to this value after reading several posts about YARN killing containers (for example, How to avoid Spark executor from getting lost and yarn container killing it due to memory limit?).
Given my settings and the log message, it seems that "YARN kills executors when their memory usage is greater than (executor-memory + executor.memoryOverhead)".
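To spell out that arithmetic with the settings above (this is my understanding of how the limit is derived, not something the log states beyond the 12 GB figure):

```
executor container limit = executor-memory + spark.yarn.executor.memoryOverhead
                         = 6 GB + 6 GB
                         = 12 GB
reported usage           = 12.4 GB  -> exceeds the limit, so YARN kills the container
```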