It is just a matter of timing. Windows needs to spawn 4 processes for the Pool, which then need to be started up, initialized, and made ready to consume from the Queue. On Windows, this requires each child process to re-import the __main__ module, and it requires the Queue instances used internally by the Pool to be unpickled in each child. This takes a non-trivial amount of time. Long enough, in fact, that both of your map_async() calls have finished executing before all the processes in the Pool are even up and running. You can see this if you add some tracing to the function run by each worker in the Pool:
while maxtasks is None or (maxtasks and completed < maxtasks):
    try:
        print("getting {}".format(current_process()))
        task = get()
        print("got {}".format(current_process()))
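For context, the output below assumes a driver script roughly like the following; the function f and its inputs are my reconstruction from the printed results ([121] and [100] and the "process id" lines), not necessarily the exact original code:

import os
from multiprocessing import Pool

def f(x):
    print("process id = {}".format(os.getpid()))
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=4)
    result = pool.map_async(f, (11,))   # first job
    result1 = pool.map_async(f, (10,))  # second job
    print("result = {}".format(result.get()))
    print("result1 = {}".format(result1.get()))
    pool.close()
    pool.join()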
Output:
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5145
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5145
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
result = [121]
result1 = [100]
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
As you can see, Worker-1 starts up and consumes both tasks before workers 2-4 ever try to consume from the Queue. If you add a sleep call after creating the Pool instance in the main process, but before calling map_async, you will see that different processes handle each request:
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
# <sleeping here>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
process id = 5183
got <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
process id = 5184
getting <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
getting <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
result = [121]
result1 = [100]
got <ForkServerProcess(ForkServerPoolWorker-3, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-4, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-1, started daemon)>
got <ForkServerProcess(ForkServerPoolWorker-2, started daemon)>
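For reference, the only change relative to the sketch above is a pause after creating the Pool; something along these lines (the two-second duration is an arbitrary choice, anything long enough for the workers to finish starting will do):

import os
import time
from multiprocessing import Pool

def f(x):
    print("process id = {}".format(os.getpid()))
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=4)
    time.sleep(2)  # let all 4 workers start and block on the task Queue
    result = pool.map_async(f, (11,))
    result1 = pool.map_async(f, (10,))
    print("result = {}".format(result.get()))
    print("result1 = {}".format(result1.get()))
    pool.close()
    pool.join()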
(Note that the extra "getting"/"got" statements you see come from the sentinels that are sent to each worker to shut it down gracefully.)
Using Python 3.x on Linux, I can reproduce this behavior with the 'spawn' and 'forkserver' start methods, but not with 'fork'. Presumably because forking the child processes is much faster than spawning them and re-importing __main__.
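If you want to try this yourself, the start method can be selected explicitly via multiprocessing.get_context(); a minimal sketch, reusing the same squaring function as above:

import os
import multiprocessing as mp

def f(x):
    print("process id = {}".format(os.getpid()))
    return x * x

if __name__ == '__main__':
    # Compare 'fork', 'spawn', and 'forkserver': with 'fork' the workers are
    # ready almost immediately, so the two jobs tend to land on different
    # workers; with 'spawn'/'forkserver' the first worker to come up often
    # consumes both tasks.
    ctx = mp.get_context('forkserver')
    with ctx.Pool(processes=4) as pool:
        result = pool.map_async(f, (11,))
        result1 = pool.map_async(f, (10,))
        print("result = {}".format(result.get()))
        print("result1 = {}".format(result1.get()))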