I am trying to run a Spark application written in Scala 2.11.8 on Spark 2.1, on an EMR cluster (release 5.3.0). I configured the cluster with the following JSON:
[ { "Classification": "hadoop-env", "Configurations": [ { "Classification": "export", "Configurations": [], "Properties": { "JAVA_HOME": "/usr/lib/jvm/java-1.8.0" } } ], "Properties": {} }, { "Classification": "spark-env", "Configurations": [ { "Classification": "export", "Configurations": [], "Properties": { "JAVA_HOME": "/usr/lib/jvm/java-1.8.0" } } ], "Properties": {} } ]
If I run the application in client mode, everything works fine. When I try to run it in cluster mode, it fails with exit code 12.
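For reference, the launch differs only in the deploy mode passed to YARN. A minimal sketch using Spark's SparkLauncher API (the jar path and main class below are placeholders, not my real ones):

import org.apache.spark.launcher.SparkLauncher

object Launch {
  def main(args: Array[String]): Unit = {
    // Same application, only the deploy mode changes.
    // Jar path and main class are placeholders.
    val proc = new SparkLauncher()
      .setAppResource("/home/hadoop/my-app.jar") // placeholder
      .setMainClass("com.example.Main")          // placeholder
      .setMaster("yarn")
      .setDeployMode("cluster")                  // "client" works; "cluster" exits with code 12
      .launch()
    proc.waitFor()
  }
}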
Here is the part of the main log where I see the status code:
17/02/01 10:08:26 INFO TaskSetManager: Finished task 79.0 in stage 0.0 (TID 79) in 293 ms on ip-10-234-174-231.us-west-2.compute.internal (executor 2) (78/11102)
17/02/01 10:08:27 INFO YarnAllocator: Driver requested a total of 19290 executor(s).
17/02/01 10:08:27 INFO ApplicationMaster: Final app status: FAILED, exitCode: 12, (reason: Exception was thrown 1 time(s) from Reporter thread.)
17/02/01 10:08:27 INFO SparkContext: Invoking stop() from shutdown hook
UPDATE:
As part of the job, I need to read some data from S3, with something like this:

sc.textFile("s3n://stambucket/impressions/*/2017-01-0[1-9]/*/impression_recdate*")

If I take only one day, there are no errors, but with all nine days I get exit code 12. This is even stranger given that the same nine days run just fine in client mode.
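For completeness, a minimal sketch of the two reads (the bucket and glob pattern are from my job; the single-day path and the counts are just illustrative, to force the read):

import org.apache.spark.{SparkConf, SparkContext}

object ImpressionsRead {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ImpressionsRead"))

    // One day: completes in both client and cluster mode.
    val oneDay = sc.textFile("s3n://stambucket/impressions/*/2017-01-01/*/impression_recdate*")
    println(oneDay.count())

    // Nine days via a character-class glob: client mode completes,
    // cluster mode dies with exit code 12.
    val nineDays = sc.textFile("s3n://stambucket/impressions/*/2017-01-0[1-9]/*/impression_recdate*")
    println(nineDays.count())

    sc.stop()
  }
}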