Spark cluster example not working: worker not shown in the web UI

I am learning Spark and wanted to run the simplest possible cluster, consisting of two physical machines. I did all the basic setup, and everything seems to be in order. The output of the start-all.sh script is as follows:

[username@localhost sbin]$ ./start-all.sh 
starting org.apache.spark.deploy.master.Master, logging to /home/username/spark-1.6.0-bin-hadoop2.6/logs/spark-username-org.apache.spark.deploy.master.Master-1-localhost.out
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /home/username/spark-1.6.0-bin-hadoop2.6/logs/spark-username-org.apache.spark.deploy.worker.Worker-1-localhost.out
username@192.168.???.??: starting org.apache.spark.deploy.worker.Worker, logging to /home/username/spark-1.6.0-bin-hadoop2.6/logs/spark-username-org.apache.spark.deploy.worker.Worker-1-localhost.localdomain.out

So there are no errors, and it looks like the master node is running, along with two worker nodes. However, when I open the web UI at 192.168.???.??:8080, it shows only one worker, the local one. My problem is similar to the one described here: Spark Clusters: worker info doesn't show on web UI, but fiddling with my /etc/hosts file changes nothing. All it contains is:

127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6 

What am I missing? Both machines are running Fedora Workstation x86_64.


The problem is the hostname of the master machine, localhost. Look at the master's startup message:

starting org.apache.spark.deploy.master.Master, logging to 
/home/.../spark-username-org.apache.spark.deploy.master.Master-1-localhost.out

Inside that log you will find:

16/02/17 11:13:54 WARN Utils: Your hostname, localhost resolves to a loopback address: 127.0.0.1; using 192.168.128.224 instead (on interface eno1)

And then the worker fails to reach the master:

16/02/17 11:13:58 WARN Worker: Failed to connect to master localhost:7077
java.io.IOException: Failed to connect to localhost/127.0.0.1:7077
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
    at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:167)
    at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:200)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:187)
    at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:183)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: localhost/127.0.0.1:7077

The master registers itself as localhost, so the remote worker tries to connect to localhost:7077, which resolves to the worker's own loopback interface, and the connection is refused. That is why only the local worker appears in the web UI.

What to check:

  • Give each machine a hostname that resolves to its real, routable IP address (or bind the master to that IP directly) instead of localhost; a sketch follows this list.
  • Check that ssh access between the machines works.
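
A minimal sketch of the first suggestion; the hostnames and addresses below are made up for illustration:

# /etc/hosts on BOTH machines (hypothetical example entries),
# added alongside the existing loopback lines
192.168.1.10   master-node
192.168.1.11   worker-node

# verify the name now resolves to the routable address, not 127.0.0.1
getent hosts master-node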

You have to tell Spark explicitly which IP address to use; do not rely on hostname resolution. Pass the master's IP both when starting the master and when starting the workers.

To start the master:

SPARK_MASTER_IP=YOUR_SPARK_MASTER_IP ${SPARK_HOME}/sbin/start-master.sh

Then, to start a worker that connects to it:

${SPARK_HOME}/sbin/start-slave.sh spark://YOUR_SPARK_MASTER_IP:PORT
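
For example, assuming a (hypothetical) master address of 192.168.1.10 and the default standalone master port 7077:

# on the master machine
SPARK_MASTER_IP=192.168.1.10 ${SPARK_HOME}/sbin/start-master.sh

# on each worker machine
${SPARK_HOME}/sbin/start-slave.sh spark://192.168.1.10:7077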

Good luck!


I had a similar problem that was resolved by setting SPARK_MASTER_IP in $SPARK_HOME/conf/spark-env.sh. spark-env.sh essentially sets the SPARK_MASTER_IP environment variable, which tells the Master which IP address to bind to; start-master.sh then reads this variable and binds the Master to it. With SPARK_MASTER_IP set, the Master is reachable from outside the box it runs on.
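
As a concrete illustration (the address is hypothetical), the relevant line in $SPARK_HOME/conf/spark-env.sh looks like this:

# $SPARK_HOME/conf/spark-env.sh
# bind the standalone Master to a routable address instead of localhost
export SPARK_MASTER_IP=192.168.1.10

After editing it, restart the cluster with sbin/stop-all.sh and sbin/start-all.sh so that start-master.sh picks the variable up.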


Source: https://habr.com/ru/post/1628989/

