The amount of parallelism you get is highly dependent on the workload of your application. If your requests are CPU related, you will only submit one request at a time. On the other hand, if your requests are RPC related, you can potentially serve 10 concurrent requests. However, there are two relative limits:
1. The size of the copy. The default instance of 600 MHz can only handle so many simultaneous requests before exceeding the CPU limit, overloading your instance and significantly increasing latency.
2. There is a hard limit for concurrent requests. It depends on the implementation and can be changed, but at the moment on python27 it is 8.
source share