I have a Spark setup on 3 machines, installed from a tar file. There is no other pre-configuration: I edited the slaves file and started the master and workers. I can see the Spark UI on port 8080. Now I want to run a simple Python script on the Spark cluster.
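For context, the setup is roughly this (the worker hostnames below are placeholders for my machines):

    # conf/slaves on the master machine: one worker hostname/IP per line
    worker-node-1
    worker-node-2

    # then, from the Spark installation directory on the master:
    sbin/start-all.sh   # starts the master plus one worker per slaves entry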
    import sys
    from random import random
    from operator import add

    from pyspark import SparkContext

    if __name__ == "__main__":
        """
        Usage: pi [partitions]
        """
        sc = SparkContext(appName="PythonPi")
        partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
        n = 100000 * partitions

        def f(_):
            x = random() * 2 - 1
            y = random() * 2 - 1
            return 1 if x ** 2 + y ** 2 < 1 else 0

        count = sc.parallelize(xrange(1, n + 1), partitions).map(f).reduce(add)
        print "Pi is roughly %f" % (4.0 * count / n)

        sc.stop()
I run this command:

    spark-submit --master spark://IP:7077 pi.py 1
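With the placeholder filled in (the master IP from the log below, and the script sitting at /opt/pi.py per the traceback), that is effectively:

    spark-submit --master spark://10.77.36.243:7077 /opt/pi.py 1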
But I get the following error:
    14/12/22 18:31:23 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
    14/12/22 18:31:38 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
    14/12/22 18:31:43 INFO client.AppClient$ClientActor: Connecting to master spark://10.77.36.243:7077...
    14/12/22 18:31:53 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
    14/12/22 18:32:03 INFO client.AppClient$ClientActor: Connecting to master spark://10.77.36.243:7077...
    14/12/22 18:32:08 WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
    14/12/22 18:32:23 ERROR cluster.SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
    14/12/22 18:32:23 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
    14/12/22 18:32:23 INFO scheduler.TaskSchedulerImpl: Cancelling stage 0
    14/12/22 18:32:23 INFO scheduler.DAGScheduler: Failed to run reduce at /opt/pi.py:21
    Traceback (most recent call last):
      File "/opt/pi.py", line 21, in <module>
        count = sc.parallelize(xrange(1, n + 1), partitions).map(f).reduce(add)
      File "/usr/local/spark/python/pyspark/rdd.py", line 759, in reduce
        vals = self.mapPartitions(func).collect()
      File "/usr/local/spark/python/pyspark/rdd.py", line 723, in collect
        bytesInJava = self._jrdd.collect().iterator()
      File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
      File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
    py4j.protocol.Py4JJavaError: An error occurred while calling o26.collect.
    : org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
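To rule out the script itself, it can also be run in local mode, which bypasses the standalone master entirely:

    spark-submit --master local[2] /opt/pi.py 1   # local mode: no cluster master involved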
Has anyone faced the same problem? Please help with this.