I need help with parallel processing, and I'm trying to get it working as soon as possible.
It just involves splitting a large data array into smaller pieces and running the same script on each fragment.
I think this is called embarrassingly parallel.
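To make it concrete, here is a rough sketch of the pattern I mean, just using Python's standard multiprocessing module on a single machine (the function names, chunk count, and "calculation" are placeholders, not my actual script):

    import csv
    from multiprocessing import Pool

    def process_chunk(args):
        """Hypothetical per-chunk worker: runs the calculation on one slice
        of the data and writes its results to its own CSV file."""
        chunk_id, chunk = args
        results = [[x, x * 2] for x in chunk]   # placeholder calculation
        with open('results_%d.csv' % chunk_id, 'wb') as f:
            csv.writer(f).writerows(results)
        return chunk_id

    if __name__ == '__main__':
        data = range(1000000)                   # stands in for my large array
        n_chunks = 8
        size = (len(data) + n_chunks - 1) // n_chunks
        chunks = [(i, data[i * size:(i + 1) * size]) for i in range(n_chunks)]
        pool = Pool(processes=n_chunks)
        pool.map(process_chunk, chunks)
        pool.close()
        pool.join()

What I can't work out is how to do the same kind of thing across cloud machines rather than local cores.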
I would be very grateful if someone out there could offer a template to achieve this, using either Amazon's cloud services or PiCloud.
I've made initial forays into Amazon EC2 and PiCloud (the script I'll run on each piece of data is in Python), but I realize I can't figure out how to do this without some help.
So any pointers would be greatly appreciated. I'm just looking for basic help (for those who know): for example, the basic steps involved in setting up parallel cores or processors using EC2 or PiCloud or something else, running the script in parallel on each piece, and saving the output, i.e. the script writes the results of its calculations to a CSV file.
I am running Ubuntu 12.04, and my Python 2.7 script doesn't use any non-standard libraries, just os and csv. The script isn't complicated; the data is just too large for my machine and timeframe.
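For what it's worth, here is the rough shape I imagine the per-chunk script taking, so that the same file could be launched once per core or per EC2 instance with a different chunk index (again, a sketch with made-up names, not my real code):

    import csv
    import os
    import sys

    def compute(row):
        """Placeholder for my real calculation on one row of data."""
        return [row, row * row]

    if __name__ == '__main__':
        # The launcher (a shell loop, or whatever EC2/PiCloud mechanism
        # is appropriate) passes which chunk this worker handles.
        chunk_id = int(sys.argv[1])
        n_chunks = int(sys.argv[2])

        data = range(1000000)          # stands in for loading my large array
        size = (len(data) + n_chunks - 1) // n_chunks
        my_slice = data[chunk_id * size:(chunk_id + 1) * size]

        if not os.path.isdir('output'):
            os.makedirs('output')
        out_path = os.path.join('output', 'results_%d.csv' % chunk_id)
        with open(out_path, 'wb') as f:
            writer = csv.writer(f)
            for row in my_slice:
                writer.writerow(compute(row))

So something like `python worker.py 0 8`, `python worker.py 1 8`, etc., one per core or per machine, with the CSV fragments concatenated afterwards. What I'm missing is how to launch and collect that across EC2 or PiCloud.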