Spark Mesos Cluster Mode Using Dispatcher

I have only one machine and want to run Spark jobs using Mesos cluster mode. It may make more sense to work with a cluster of nodes, but I basically want to test Mesos first to see whether it can use resources more efficiently (i.e., run multiple Spark jobs at the same time without static partitioning). I tried several ways, but to no avail. Here is what I did:

  • Build Mesos and start a Mesos master plus two slaves on the same machine:

    sudo ./bin/mesos-master.sh --ip=127.0.0.1 --work_dir=/var/lib/mesos
    sudo ./bin/mesos-slave.sh --master=127.0.0.1:5050 --port=5051 --work_dir=/tmp/mesos1
    sudo ./bin/mesos-slave.sh --master=127.0.0.1:5050 --port=5052 --work_dir=/tmp/mesos2
  • Launch the MesosClusterDispatcher:

     sudo ./sbin/start-mesos-dispatcher.sh --master mesos://localhost:5050 
  • Submit the app with the dispatcher as the master URL:

     spark-submit --master mesos://localhost:7077 <other-config> <jar file> 

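Before submitting, it can help to verify that each piece is actually listening. This is a sketch assuming the default ports used above (the Mesos master web UI on 5050, the dispatcher web UI on 8081, and the dispatcher's submission port on 7077); adjust hosts and ports to your setup:

```shell
# Check that the Mesos master is up and has registered both slaves
curl -s http://127.0.0.1:5050/master/state | grep -o '"activated_slaves":[0-9]*'

# Check that the MesosClusterDispatcher web UI is reachable
curl -sI http://127.0.0.1:8081 | head -n 1

# The dispatcher's submission endpoint itself listens on 7077 by default;
# that is the port spark-submit should target in cluster mode.
```

These are plain CLI probes against a running cluster, so there is no expected output to assert without the services up.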
But it does not work:

  E0925 17:30:30.158846 807608320 socket.hpp:174] Shutdown failed on fd=61: Socket is not connected [57]
  E0925 17:30:30.159545 807608320 socket.hpp:174] Shutdown failed on fd=62: Socket is not connected [57]

If I use spark-submit --deploy-mode cluster, I get a different error:

  Exception in thread "main" org.apache.spark.deploy.rest.SubmitRestConnectionException: Unable to connect to server 

It works fine if I skip the dispatcher and use the Mesos master URL directly: --master mesos://localhost:5050 (client mode). According to the documentation, cluster mode is not supported for Mesos clusters, yet the docs also give a separate command for cluster mode, which is confusing. My questions are:

  • How can I make it work?
  • Should I use client mode instead of cluster mode if I submit the application/jar directly from the master node?
  • If I have one machine, should I create one or more Mesos slave processes? In principle, I have a number of Spark jobs and do not want to do static partitioning of resources. But is Mesos noticeably slower without static partitioning?

Thanks.

3 answers

I tried your script and it works for me. One difference: I used the machine's IP address instead of "localhost" and "127.0.0.1". Also open http://your_dispatcher:8081 in a browser to check that the dispatcher is up.

This is my spark-submit command:

 $spark-submit --deploy-mode cluster --master mesos://192.168.11.79:7077 --class "SimpleApp" SimpleAppV2.jar 

If the submission succeeds, you will see a response like this:

 {
   "action" : "CreateSubmissionResponse",
   "serverSparkVersion" : "1.5.0",
   "submissionId" : "driver-20151006164749-0001",
   "success" : true
 }
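The response is plain JSON, so you can also capture the submissionId programmatically, e.g. to poll or kill the driver later. A minimal Python sketch, using the example response shown here:

```python
import json

# Example CreateSubmissionResponse returned by the dispatcher's REST endpoint
response_text = """
{
  "action" : "CreateSubmissionResponse",
  "serverSparkVersion" : "1.5.0",
  "submissionId" : "driver-20151006164749-0001",
  "success" : true
}
"""

response = json.loads(response_text)
if response.get("success"):
    # The submissionId identifies the driver for later status/kill requests
    print("Driver submitted:", response["submissionId"])
else:
    print("Submission failed:", response)
```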

When I got an error log like yours, restarting the machine and trying again also worked.


There seem to be two things you are conflating: running a Spark application on a cluster (as opposed to locally), and running the driver inside the cluster.

From Submitting Applications:

The spark-submit script in Spark's bin directory is used to launch applications on a cluster. It can use all of Spark's supported cluster managers through a uniform interface, so you don't have to configure your application specially for each one.

So Mesos is one of the supported cluster managers, and you can run Spark applications on a Mesos cluster.

What Mesos did not support at the time of writing was launching the driver inside the cluster, which is what the --deploy-mode argument of ./bin/spark-submit controls. Since the default value of --deploy-mode is client, you can simply omit it, or specify it explicitly:

 ./bin/spark-submit --deploy-mode client ... 

Try using port 6066 instead of 7077. Newer versions of Spark prefer the REST API for submitting jobs.
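For example, reusing the command from the accepted answer but pointing at the REST submission port instead (a sketch: the host, class name, and jar are placeholders from that answer, and 6066 is Spark's default REST submission port):

```shell
# Submit through the dispatcher's REST endpoint (default port 6066)
# rather than the legacy RPC port 7077.
spark-submit \
  --deploy-mode cluster \
  --master mesos://192.168.11.79:6066 \
  --class "SimpleApp" \
  SimpleAppV2.jar
```

This is a CLI invocation against a live dispatcher, so there is no output to assert without one running.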

See https://issues.apache.org/jira/browse/SPARK-5388


Source: https://habr.com/ru/post/1232258/
