Spark Shell - __spark_libs__.zip does not exist

I am new to Spark and am setting up a Spark cluster with HA enabled.

When I start a Spark shell for testing with: bash spark-shell --master yarn --deploy-mode client

I get the following error (see full error below): file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip does not exist

The application is marked as unsuccessful in the YARN web UI, and no containers start.

When I run the shell with spark-shell --master local, it opens without errors.

I noticed that the files are written only to the /tmp folder on the node where the shell is launched.

Any help would be greatly appreciated. Let me know if additional information is required.

Environment Variables:

    HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/
    YARN_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/
    SPARK_HOME=/opt/spark-2.0.2-bin-hadoop2.7/

Full error message:

    16/11/30 21:08:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    16/11/30 21:08:49 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
    16/11/30 21:09:03 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_e14_1480532715390_0001_02_000003 on host: slave2. Exit status: -1000. Diagnostics: File file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip does not exist
    java.io.FileNotFoundException: File file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip does not exist
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
    16/11/30 22:29:28 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
    16/11/30 22:29:28 ERROR spark.SparkContext: Error initializing SparkContext.
    java.lang.IllegalStateException: Spark context stopped while waiting for backend
        at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:584)
        at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:162)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:546)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
        at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
        at $line3.$read$$iw$$iw.<init>(<console>:15)
        at $line3.$read$$iw.<init>(<console>:31)
        at $line3.$read.<init>(<console>:33)
        at $line3.$read$.<init>(<console>:37)
        at $line3.$read$.<clinit>(<console>)
        at $line3.$eval$.$print$lzycompute(<console>:7)
        at $line3.$eval$.$print(<console>:6)
        at $line3.$eval.$print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
        at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
        at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
        at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
        at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
        at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
        at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
        at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
        at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
        at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
        at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
        at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
        at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:94)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
        at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
        at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
        at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
        at org.apache.spark.repl.Main$.doMain(Main.scala:68)
        at org.apache.spark.repl.Main$.main(Main.scala:51)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

yarn-site.xml

    <configuration>
      <property><name>yarn.resourcemanager.connect.retry-interval.ms</name><value>2000</value></property>
      <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
      <property><name>yarn.resourcemanager.ha.automatic-failover.enabled</name><value>true</value></property>
      <property><name>yarn.resourcemanager.ha.automatic-failover.embedded</name><value>true</value></property>
      <property><name>yarn.resourcemanager.cluster-id</name><value>yarn-cluster</value></property>
      <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
      <property><name>yarn.resourcemanager.ha.id</name><value>rm1</value></property>
      <property><name>yarn.resourcemanager.scheduler.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value></property>
      <property><name>yarn.resourcemanager.recovery.enabled</name><value>true</value></property>
      <property><name>yarn.resourcemanager.store.class</name><value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value></property>
      <property><name>yarn.resourcemanager.zk-address</name><value>master:2181,slave1:2181,slave2:2181</value></property>
      <property><name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name><value>5000</value></property>
      <property><name>yarn.resourcemanager.work-preserving-recovery.enabled</name><value>true</value></property>
      <property><name>yarn.resourcemanager.address.rm1</name><value>master:23140</value></property>
      <property><name>yarn.resourcemanager.scheduler.address.rm1</name><value>master:23130</value></property>
      <property><name>yarn.resourcemanager.webapp.https.address.rm1</name><value>master:23189</value></property>
      <property><name>yarn.resourcemanager.webapp.address.rm1</name><value>master:23188</value></property>
      <property><name>yarn.resourcemanager.resource-tracker.address.rm1</name><value>master:23125</value></property>
      <property><name>yarn.resourcemanager.admin.address.rm1</name><value>master:23141</value></property>
      <property><name>yarn.resourcemanager.address.rm2</name><value>slave1:23140</value></property>
      <property><name>yarn.resourcemanager.scheduler.address.rm2</name><value>slave1:23130</value></property>
      <property><name>yarn.resourcemanager.webapp.https.address.rm2</name><value>slave1:23189</value></property>
      <property><name>yarn.resourcemanager.webapp.address.rm2</name><value>slave1:23188</value></property>
      <property><name>yarn.resourcemanager.resource-tracker.address.rm2</name><value>slave1:23125</value></property>
      <property><name>yarn.resourcemanager.admin.address.rm2</name><value>slave1:23141</value></property>
      <property><description>Address where the localizer IPC is.</description><name>yarn.nodemanager.localizer.address</name><value>0.0.0.0:23344</value></property>
      <property><description>NM Webapp address.</description><name>yarn.nodemanager.webapp.address</name><value>0.0.0.0:23999</value></property>
      <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
      <property><name>yarn.nodemanager.local-dirs</name><value>/tmp/pseudo-dist/yarn/local</value></property>
      <property><name>yarn.nodemanager.log-dirs</name><value>/tmp/pseudo-dist/yarn/log</value></property>
      <property><name>mapreduce.shuffle.port</name><value>23080</value></property>
      <property><name>yarn.resourcemanager.work-preserving-recovery.enabled</name><value>true</value></property>
    </configuration>
+7
3 answers

This error occurred due to the configuration in the core-site.xml file.

Note that for this file to be found, your HADOOP_CONF_DIR environment variable must be set.

In my case, I added HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/ to ./conf/spark-env.sh.
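
A minimal sketch of that spark-env.sh entry (the Hadoop path matches the environment above; adjust it to your installation):

    # ./conf/spark-env.sh (under $SPARK_HOME)
    export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/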

See the related question: running a job on a YARN cluster fails with java.io.FileNotFoundException: the file does not exist, though the file exists on the master node.

core-site.xml

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://master:9000</value>
      </property>
    </configuration>

If this endpoint is unreachable, or if Spark determines that the local file system is the same as the destination file system, the library files will not be distributed to the other nodes in your cluster, which causes the errors above.
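
To confirm which default file system the Hadoop configuration actually resolves to on a given node, you can query it directly; a generic check, assuming the hdfs CLI is on the PATH:

    hdfs getconf -confKey fs.defaultFS
    # expected here: hdfs://master:9000
    # if it prints file:///, the core-site.xml above is not being picked up
    # (check HADOOP_CONF_DIR on that node)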

In my case, the node could not reach port 9000 on the specified host.
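
To test reachability from a worker node, a simple port probe is enough; the hostname and port come from the core-site.xml above, and nc (netcat) is assumed to be installed:

    nc -zv master 9000   # succeeds if the NameNode endpoint is reachable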

Debugging

Set the log level to INFO. You can do this as follows:

  • Copy ./conf/log4j.properties.template to ./conf/log4j.properties

  • In that file, set log4j.logger.org.apache.spark.repl.Main=INFO (see the sketch below)
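
Putting the two steps together (a sketch that assumes you run it from $SPARK_HOME, and that the template sets this logger to WARN, as the Spark 2.0 template does):

    cd $SPARK_HOME
    cp conf/log4j.properties.template conf/log4j.properties
    # then edit conf/log4j.properties and change:
    #   log4j.logger.org.apache.spark.repl.Main=WARN
    # to:
    #   log4j.logger.org.apache.spark.repl.Main=INFO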

Launch the Spark shell as usual. If your problem is the same as mine, you should see an informational message such as:

    INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-c1a6cdcd-d348-4253-8755-5086a8931e75/__spark_libs__1391186608525933727.zip

That message points to the problem: because Spark believes the source and destination file systems are the same, it never uploads the libraries, which starts the chain reaction that ends in the missing-file errors on the other nodes.

+2

I see no errors in your logs; there are only warnings, which you can avoid by adding these environment variables:

    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

For the exception, try manually setting the Spark configuration for YARN: http://badrit.com/blog/2015/2/29/running-spark-on-yarn#.WD_e66IrJsM

    hdfs dfs -mkdir -p /user/spark/share/lib
    hdfs dfs -put $SPARK_HOME/assembly/lib/spark-assembly_*.jar /user/spark/share/lib/spark-assembly.jar
    export SPARK_JAR=hdfs://your-server:port/user/spark/share/lib/spark-assembly.jar
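
Note that SPARK_JAR and the assembly JAR are Spark 1.x mechanisms; Spark 2.0.2, as used in the question, ships individual JARs in $SPARK_HOME/jars, and the log's warning ("Neither spark.yarn.jars nor spark.yarn.archive is set") names the 2.x equivalents. A sketch of the 2.x variant, with an HDFS path chosen purely for illustration:

    hdfs dfs -mkdir -p /user/spark/share/jars
    hdfs dfs -put $SPARK_HOME/jars/*.jar /user/spark/share/jars/
    # then in conf/spark-defaults.conf:
    # spark.yarn.jars hdfs://master:9000/user/spark/share/jars/*.jar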

Hope this helps.

0

Check whether your SparkSession configuration hard-codes master("local[*]"). I removed that setting and it worked.
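
For illustration only (this builder is not from the question): a hard-coded master in the session builder overrides the --master yarn flag on the command line, so removing it lets the command-line setting take effect:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("MyApp")      // hypothetical application name
      // .master("local[*]") // remove this line so --master yarn is honored
      .getOrCreate()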

0

Source: https://habr.com/ru/post/1012787/

