Around March 2011, I tested GAE (the Java version) as a potential platform for massively parallel computing. The date matters because GAE is constantly evolving. I found that GAE throttles the application effectively, capping computational throughput at a gain of about 43.2x. Has anyone successfully used GAE for massively parallel computing, or achieved a much higher throughput gain? For the purposes of this question, I arbitrarily define massively parallel computing as a throughput gain of more than 1000x.
I used a desktop client that spawned multiple threads, each requesting a URL, and I ran the work on the GAE Task Queue. The application took very little input and produced very little output, whether to the Datastore or as HTML, because it was designed purely to measure computational throughput.
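A minimal sketch of such a multi-threaded client, assuming a fixed thread pool and a pluggable request. The class name `LoadClient`, the method `runAll`, and the simulated request are hypothetical and not taken from my actual test code. In a real run the `Callable` would issue an HTTP GET against the application's URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LoadClient {

    /**
     * Runs `requests` calls of `request` on `threads` threads and returns
     * the wall-clock seconds until every call has returned.
     */
    public static double runAll(int requests, int threads, Callable<Void> request)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(request));
            }
            for (Future<Void> f : pending) {
                f.get(); // blocks until this request is done; rethrows its failure
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an HTTP GET against the (hypothetical) task URL:
        // each simulated request simply sleeps for 50 ms.
        Callable<Void> fakeRequest = () -> {
            Thread.sleep(50);
            return null;
        };
        double elapsed = runAll(40, 8, fakeRequest);
        System.out.println("elapsed seconds: " + elapsed);
    }
}
```

Passing the request in as a `Callable` keeps the timing logic separate from the transport, so the same harness can be dry-run locally before pointing it at GAE.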
Since it is often recommended to keep GAE tasks under 1 second (although it is unclear whether this recommendation applies to Task Queue tasks), I tried various permutations. Some of my results are included here. As you can see, even with 0.8-second tasks, which comply with the 1-second recommendation, the throughput gain peaked at 43.2x.
Elapsed   Tasks      SecondsOf     Total     Gain
Seconds   Requested  WorkPerTask   Work

FLT (FEW LARGE TASKS)
     15         72          1         72      4.9
    103         72         20       1440     14.0
   1524         72        400      28800     18.9

MST (MANY SMALL TASKS)
     53       1000        0.8        800     15.1
     63       2000        0.8       1600     25.4
    127       4000        0.8       3200     25.2
    313       4000        0.8       3200     10.2
    258       8000        0.8       6400     24.8
    177       8000        0.8       6400     36.2   (Have 5% of tasks do nothing.)
     49       2000        0.8       1600     32.7   (Have 1% of tasks do nothing.)
     37       2000        0.8       1600     43.2   (Have 5% of tasks do nothing.)
     42       2000        0.8       1600     38.1   (Have 10% of tasks do nothing.)
    249       2000        0.8       1600      6.4   (Have 50% of tasks do nothing.)

MLT (MANY LARGE TASKS)
   6373       1000        200     200000     31.4
    380        200         60      12000     31.6
Note that Task Queue tasks could not practically run for more than 600 seconds, so the longest task I submitted was 400 seconds, to leave a safety margin. In the cases where some tasks did nothing, the aim was to reduce the average amount of work per task in order to influence Google's overall accounting. For example, in the 10% case, each of the 2000 tasks does 0.8 seconds of work and an additional 222 tasks do no work, so 10% of all tasks do nothing.
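The padding arithmetic for the 10% case can be sketched as follows. The class and method names are hypothetical. Only the 10% figure (222 extra tasks on top of 2000) comes from my runs; applying the same formula to the other percentages is an assumption.

```java
public class IdlePadding {

    /** Number of no-op tasks to add so that `idleFraction` of all tasks do nothing. */
    public static long idleTasks(long workingTasks, double idleFraction) {
        // idle / (working + idle) = f  =>  idle = working * f / (1 - f)
        return Math.round(workingTasks * idleFraction / (1.0 - idleFraction));
    }

    /** Average seconds of work per task once the no-op tasks are counted. */
    public static double averageWork(long workingTasks, double secondsPerTask, long idle) {
        return workingTasks * secondsPerTask / (workingTasks + idle);
    }

    public static void main(String[] args) {
        long idle = idleTasks(2000, 0.10);
        System.out.println(idle); // 222
        // Average work drops from 0.8 s to roughly 0.72 s per task.
        System.out.println(averageWork(2000, 0.8, idle));
    }
}
```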
Edit: @PeterRecore, I am measuring the throughput gain, which is totalWorkInSeconds divided by elapsedTimeInSeconds, measured at the client. The client makes the requests and measures the elapsed time until all GAE tasks have completed, as indicated by each one sending a trivially small response.

I am trying to find out whether GAE, in its current form, can be used to build an application that reaches high throughput gains. In March 2011 this seemed unlikely. How about today? If it is possible, how would it be done, or how did you actually do it, and what throughput did you achieve?

As I said, Datastore use is minimal: each task writes one trivially small object when it executes. Each task runs a loop whose iteration count is proportional to SecondsOfWorkPerTask.

Starting up GAE instances seems to be part of the problem, and Google exacerbates it by telling people that it prefers sub-second tasks. The problem is reduced with large tasks, because the instance startup then represents a smaller percentage of the cycles used.
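As a worked example of that definition, the sketch below recomputes the Gain column for two rows of the results table. The class and method names are hypothetical. The inputs are the figures reported in the table.

```java
public class ThroughputGain {

    /** Gain = total seconds of work requested / elapsed wall-clock seconds at the client. */
    public static double gain(int tasks, double secondsOfWorkPerTask, double elapsedSeconds) {
        double totalWorkInSeconds = tasks * secondsOfWorkPerTask;
        return totalWorkInSeconds / elapsedSeconds;
    }

    public static void main(String[] args) {
        // Best MST row: 2000 tasks x 0.8 s of work, 37 s elapsed.
        System.out.println(gain(2000, 0.8, 37));   // about 43.2
        // Largest FLT row: 72 tasks x 400 s of work, 1524 s elapsed.
        System.out.println(gain(72, 400, 1524));   // about 18.9
    }
}
```

By this measure, a gain of 1000x for the same 1600 seconds of work would require the whole batch to finish in 1.6 seconds of elapsed time.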