The difference between yarn-client mode and yarn-cluster mode

I ran into a small problem when running the same code in yarn-client mode and in yarn-cluster mode. My code works fine when I run it in yarn-client mode, but it fails when it runs in yarn-cluster mode.

It throws a file-not-found exception, which indicates that the pyspark.zip file could not be found. Any insight into this would be helpful.

1 answer

In yarn-cluster mode, the driver runs in the Application Master (inside a YARN container). In yarn-client mode, it runs in the client.
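To make the distinction concrete, here is a minimal sketch of how the same job might be submitted in each mode. The helper `build_spark_submit` and the script name `my_job.py` are hypothetical and used only for illustration. The `--master yarn` and `--deploy-mode` options are standard `spark-submit` flags; older Spark releases also accepted `--master yarn-client` and `--master yarn-cluster`.

```python
def build_spark_submit(app_path, deploy_mode):
    """Return the argv list for submitting app_path to YARN.

    build_spark_submit is a hypothetical helper, not part of Spark.
    deploy_mode "client" keeps the driver on the submitting machine;
    "cluster" runs the driver inside the YARN Application Master.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", deploy_mode,
        app_path,
    ]


# my_job.py is a placeholder for your own PySpark script.
client_cmd = build_spark_submit("my_job.py", "client")
cluster_cmd = build_spark_submit("my_job.py", "cluster")

print(" ".join(client_cmd))
print(" ".join(cluster_cmd))
```

Because the driver runs on a cluster node in cluster mode, anything it needs (including the PySpark libraries) has to be available there rather than only on the machine you submit from, which may be why a job that works in client mode fails in cluster mode.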

In yarn-cluster mode, the Spark shell is not supported.

Update: as of Spark 1.4, pyspark is supported in yarn-cluster mode (see SPARK-5162).


Source: https://habr.com/ru/post/1608002/

