Spark: Executor Lost Failure (After adding a GroupBy job)

I am trying to run a Spark job on YARN in client mode. I have two nodes, each with the following configuration (the node hardware specs were given in a screenshot that is not reproduced here).

I'm getting "ExecutorLostFailure (executor 1 lost)".

I have tried most of the Spark tuning settings. I managed to get down to one lost executor; initially I was getting as many as 6 executor failures.

This is my configuration (my spark-submit):

HADOOP_USER_NAME=hdfs spark-submit --class genkvs.CreateFieldMappings --master yarn-client --driver-memory 11g --executor-memory 11G --total-executor-cores 16 --num-executors 15 --conf "spark.executor.extraJavaOptions=-XX:+UseCompressedOops -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" --conf spark.akka.frameSize=1000 --conf spark.shuffle.memoryFraction=1 --conf spark.rdd.compress=true --conf spark.core.connection.ack.wait.timeout=800 my-data/lookup_cache_spark-assembly-1.0-SNAPSHOT.jar -h hdfs://hdp-node-1.zone24x7.lk:8020 -p 800
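For reference, a minimal Scala sketch of the same tuning expressed through SparkConf (assumes Spark 1.x, where spark.shuffle.memoryFraction and spark.akka.frameSize still apply; values set directly on SparkConf take precedence over spark-submit flags):

    import org.apache.spark.{SparkConf, SparkContext}

    // Same settings as the spark-submit command above, set in code
    val conf = new SparkConf()
      .setAppName("CreateFieldMappings")
      .setMaster("yarn-client")
      .set("spark.executor.memory", "11g")
      .set("spark.akka.frameSize", "1000")
      .set("spark.shuffle.memoryFraction", "1")
      .set("spark.rdd.compress", "true")
      .set("spark.core.connection.ack.wait.timeout", "800")
    val sc = new SparkContext(conf)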

My data is about 6 GB, and the failure shows up when the job does a groupBy:

    import org.apache.spark.rdd.RDD

    // groupBy buffers all values for each key (the 4th field) on one executor
    def process(in: RDD[(String, String, Int, String)]) = {
        in.groupBy(_._4)
    }

I am new to Spark, so any pointers on what I am doing wrong here would be appreciated.

Thanks in advance.

Answer:

A few observations:

  • You set spark.shuffle.memoryFraction to 1. Why not leave it at the default of 0.2? Giving everything to shuffle leaves no memory for storage or for the tasks themselves.

  • You give 11G to an executor that runs 16 cores. That 11G is already split across three regions (storage, shuffle and task working memory), and you have set the shuffle fraction to 1. With 16 concurrent tasks, each task gets on the order of 11G / 16 ≈ 700 MB, which easily ends in an OOME / a lost executor (a revised command follows these notes).
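A sketch of a revised spark-submit along these lines; the class name, jar path and host come from the question, while 4 executor cores is an assumed value, not something the answer prescribes. Dropping the spark.shuffle.memoryFraction override restores the 0.2 default, and --executor-cores (the flag YARN actually honors, unlike --total-executor-cores) caps concurrency so each task gets roughly 11G / 4 ≈ 2.7 GB instead of ~700 MB:

    HADOOP_USER_NAME=hdfs spark-submit --class genkvs.CreateFieldMappings --master yarn-client --driver-memory 11g --executor-memory 11G --executor-cores 4 --num-executors 15 --conf "spark.executor.extraJavaOptions=-XX:+UseCompressedOops" --conf spark.akka.frameSize=1000 --conf spark.rdd.compress=true --conf spark.core.connection.ack.wait.timeout=800 my-data/lookup_cache_spark-assembly-1.0-SNAPSHOT.jar -h hdfs://hdp-node-1.zone24x7.lk:8020 -p 800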


Source: https://habr.com/ru/post/1615356/

