To complete the AMPLab exercises, I created a key pair in us-east-1, installed the training scripts (git clone git://github.com/amplab/training-scripts.git -b ampcamp4) and set the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, following the instructions at http://ampcamp.berkeley.edu/big-data-mini-course/launching-a-bdas-cluster-on-ec2.html
Now running
./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1 -k myspark --copy launch try1
generates the following messages:
johndoe@ip-some-instance:~/projects/spark/training-scripts$ ./spark-ec2 -i ~/.ssh/myspark.pem -r us-east-1 -k myspark --copy launch try1
Setting up security groups...
Searching for existing cluster try1...
Latest Spark AMI: ami-19474270
Launching instances...
Launched 5 slaves in us-east-1b, regid = r-0c5e5ee3
Launched master in us-east-1b, regid = r-316060de
Waiting for instances to start up...
Waiting 120 more seconds...
Copying SSH key /home/johndoe/.ssh/myspark.pem to master...
ssh: connect to host ec2-54-90-57-174.compute-1.amazonaws.com port 22: Connection refused
Error connecting to host Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com 'mkdir -p ~/.ssh'' returned non-zero exit status 255, sleeping 30
ssh: connect to host ec2-54-90-57-174.compute-1.amazonaws.com port 22: Connection refused
Error connecting to host Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com 'mkdir -p ~/.ssh'' returned non-zero exit status 255, sleeping 30
...
...
subprocess.CalledProcessError: Command 'ssh -t -o StrictHostKeyChecking=no -i /home/johndoe/.ssh/myspark.pem root@ec2-54-90-57-174.compute-1.amazonaws.com '/root/spark/bin/stop-all.sh'' returned non-zero exit status 127
where root@ec2-54-90-57-174.compute-1.amazonaws.com is the login user and the hostname of the master instance. I tried -u ec2-user and increased -w to 600, but got the same error.
I can see the master and slave instances in us-east-1 when I log in to the AWS console, and I can actually ssh into the Master instance from the local ip-some-instance shell.
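To separate "sshd is not up yet" from "something is blocking the connection", a small probe like the one below could be run from the same shell before retrying the launch. This is only a minimal sketch: the function names `port_is_open` and `wait_for_port`, and the default timeout, attempt count and delay, are my own assumptions and are not part of spark-ec2.

```python
import socket
import time


def port_is_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts and DNS failures alike
        return False


def wait_for_port(host, port=22, attempts=20, delay=30.0):
    """Poll host:port up to `attempts` times, sleeping `delay` seconds
    after each failed try. Returns True as soon as the port accepts a
    connection, False if it never does. (Hypothetical helper.)"""
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_for_port("ec2-54-90-57-174.compute-1.amazonaws.com")` would keep probing the master from the log above for up to about ten minutes. If it returns True while spark-ec2 still reports "Connection refused", the instances are probably just slow to start sshd rather than misconfigured.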
I understand that the spark-ec2 script takes care of defining the master / slave security groups (which ports are open, etc.), so I should not need to configure these settings myself. The groups do show that the master and slaves accept connections on port 22 (Port: 22, Protocol: tcp, Source: 0.0.0.0/0 in the ampcamp3-slaves / master groups).
I am stuck here, and would be grateful for any pointers before I spend all my R&D funds on EC2 instances. Thanks.