Instead, I would use the Amazon Elastic MapReduce framework. You can spin machines and whole clusters up and down dynamically, and you don't have to worry about configuring them to talk to each other.
http://aws.amazon.com/elasticmapreduce/
It is used by many people, and it is mostly reliable. It saves you the absolute TON of work that normally goes into setting up and administering a cluster. The one difference from regular Hadoop is that it's best to keep your data in S3 instead of HDFS: the clusters are transient, so anything stored in HDFS disappears when the cluster does.
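To make the S3-instead-of-HDFS point concrete, here is a rough sketch of how a transient cluster might be launched from Python with boto3's EMR client (`run_job_flow`). Everything specific in it is my assumption rather than something you must use: the bucket layout (`input/`, `output/`, `code/`, `logs/`), the `mapper.py`/`reducer.py` script names, the release label, the instance types, and the default IAM role names. The point is only that the input, output, code, and logs all live under `s3://`, so nothing is lost when the cluster shuts itself down.

```python
def build_job_flow_request(bucket, core_nodes=2):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All paths point at S3 (never HDFS) so the data outlives the cluster.
    Bucket layout, script names, release label, instance types and role
    names are illustrative assumptions.
    """
    s3 = f"s3://{bucket}"
    return {
        "Name": "example-transient-cluster",
        "ReleaseLabel": "emr-6.15.0",          # assumed release
        "LogUri": f"{s3}/logs/",
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            # Transient cluster: terminate once the steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [{
            "Name": "streaming-step",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": [
                    "hadoop-streaming",
                    "-files", f"{s3}/code/mapper.py,{s3}/code/reducer.py",
                    "-mapper", "mapper.py",
                    "-reducer", "reducer.py",
                    "-input", f"{s3}/input/",
                    "-output", f"{s3}/output/",
                ],
            },
        }],
        "JobFlowRole": "EMR_EC2_DefaultRole",   # assumed default roles
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(bucket, region="us-east-1"):
    """Start the cluster and return its id (needs boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder runs without it
    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**build_job_flow_request(bucket))["JobFlowId"]
```

Calling `launch("my-bucket")` would start the cluster, run the one streaming step, and tear everything down afterwards; the results would then be sitting under `s3://my-bucket/output/`.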