Spark-shell connecting to Mesos stuck in sched.cpp

Below are my spark-defaults.conf and spark-shell output

    $ cat conf/spark-defaults.conf
    spark.master              mesos://172.16.**.***:5050
    spark.eventLog.enabled    false
    spark.broadcast.compress  false
    spark.driver.memory       4g
    spark.executor.memory     4g
    spark.executor.instances  1

    $ bin/spark-shell
    log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
    To adjust logging level use sc.setLogLevel("INFO")
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
          /_/

    Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_80)
    Type in expressions to have them evaluated.
    Type :help for more information.
    15/11/15 04:56:11 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
    I1115 04:56:12.171797 72994816 sched.cpp:164] Version: 0.25.0
    I1115 04:56:12.173741 67641344 sched.cpp:262] New master detected at master@172.16.**.***:5050
    I1115 04:56:12.173951 67641344 sched.cpp:272] No credentials provided. Attempting to register without authentication

It hangs here indefinitely, and the Mesos web interface shows many Spark frameworks spinning up: they keep registering over and over until I exit spark-shell with Ctrl-C.

[Screenshot: Mesos Web UI]

I suspect this is partly because my laptop has several IP addresses. When I launch it on the server instead, it proceeds to the next line and drops into the usual Scala REPL:

    I1116 09:53:30.265967 29327 sched.cpp:641] Framework registered with 9d725348-931a-48fb-96f7-d29a4b09f3e8-0242
    15/11/16 09:53:30 INFO mesos.MesosSchedulerBackend: Registered as framework ID 9d725348-931a-48fb-96f7-d29a4b09f3e8-0242
    15/11/16 09:53:30 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 57810.
    15/11/16 09:53:30 INFO netty.NettyBlockTransferService: Server created on 57810
    15/11/16 09:53:30 INFO storage.BlockManagerMaster: Trying to register BlockManager
    15/11/16 09:53:30 INFO storage.BlockManagerMasterEndpoint: Registering block manager 172.16.**.***:57810 with 2.1 GB RAM, BlockManagerId(driver, 172.16.**.***, 57810)
    15/11/16 09:53:30 INFO storage.BlockManagerMaster: Registered BlockManager
    15/11/16 09:53:30 INFO repl.Main: Created spark context..
    Spark context available as sc.

I run Mesos 0.25.0, built by Mesosphere, and I have set spark.driver.host to an address reachable from every machine in the Mesos cluster. I can see that every port opened by the spark-shell process is bound either to this IP address or to *.
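For reference, pinning the driver address is done in conf/spark-defaults.conf. The address below is a placeholder for illustration, not a value from the original post; it should be an IP that every Mesos node can route to:

```properties
# Hypothetical routable address -- replace with an IP reachable from all Mesos nodes.
spark.driver.host  192.0.2.10
```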


I could not find any log files that might explain why the frameworks were unregistered. Where should I look to solve this problem?

1 answer

Mesos has a rather particular view of how networking should work: it requires bidirectional communication between the master and the framework, so both sides must have a network route to each other. If you run under NAT or in containers, you may have hit this before; the usual fix is to set LIBPROCESS_IP to your publicly routable IP address on the framework side. The same likely applies to multi-homed machines, such as your laptop.
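A minimal sketch of that fix, assuming 192.0.2.10 stands in for whichever of your laptop's addresses the Mesos master can route back to (the address is a placeholder, not from the original post):

```shell
# Tell libprocess (the Mesos networking layer) which local interface to bind
# and advertise, so the master's replies reach the framework.
export LIBPROCESS_IP=192.0.2.10   # hypothetical address; use your routable IP

# Then launch the driver with this variable in its environment:
# bin/spark-shell
```

With several interfaces, libprocess may otherwise pick one the master cannot reach, which matches the endless re-registration you observed.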

You can find a little more information online, although it is unfortunately poorly documented. There is a hint on the page for their deployment scripts.


Source: https://habr.com/ru/post/1236013/

