What happens when I call multiprocessing.pool.apply_async more times than I have processors?

I have the following setup:

results = [f(args) for _ in range(10**3)]

But f(args) takes a long time to compute, so I would like to throw multiprocessing at it. I would like to do this:

pool = mp.Pool(mp.cpu_count() - 1)  # mp.cpu_count() -> 8
results = [pool.apply_async(f, args) for _ in range(10**3)]

It’s clear that I don’t have 1000 processors on my machine, so my concern is:
Does the above call create 1000 processes simultaneously competing for CPU time, or 7 processes working simultaneously, each picking up the next f(args) call as soon as its previous one finishes?

I suppose I could do something like pool.map_async(f, (args for _ in range(10**3))) to get the same results, but the purpose of this post is to understand the behavior of pool.apply_async.
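For concreteness, here is a minimal sketch of the two patterns I am comparing; f and args below are just placeholders for the real expensive call:

import multiprocessing as mp

def f(x):
    # stand-in for the real, expensive computation
    return x * x

if __name__ == '__main__':
    args = 7  # hypothetical argument; whatever f really takes
    with mp.Pool(mp.cpu_count() - 1) as pool:
        # apply_async returns AsyncResult handles immediately;
        # .get() blocks until the corresponding task has actually run
        handles = [pool.apply_async(f, (args,)) for _ in range(10**3)]
        results = [h.get() for h in handles]

        # roughly equivalent, with map_async:
        results_2 = pool.map_async(f, (args for _ in range(10**3))).get()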

2 answers

You will never have more processes running than there are workers in your pool (in your case mp.cpu_count() - 1). If you call apply_async and all the workers are busy, the task is queued and executed as soon as a worker is free. You can see this with a simple test program:

#!/usr/bin/python

import time
import multiprocessing as mp

def worker(chunk):
    print('working')
    time.sleep(10)  # simulate a long-running task
    return

def main():
    pool = mp.Pool(2)  # Only two workers
    for n in range(0, 8):
        pool.apply_async(worker, (n,))  # all 8 calls return immediately
        print("called it")
    pool.close()
    pool.join()  # wait for every queued task to finish

if __name__ == '__main__':
    main()

The output is as follows:

called it
called it
called it
called it
called it
called it
called it
called it
working
working
<delay>
working
working
<delay>
working 
working
<delay>
working
working

The number of worker processes is controlled entirely by the argument you pass to mp.Pool(). So if mp.cpu_count() returns 8 on your machine, 7 worker processes will be created.
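As a quick sanity check (a sketch; which_worker is just a hypothetical probe function), you can have each task report the name of the worker process that ran it; no matter how many tasks you submit, only cpu_count() - 1 distinct names come back:

import multiprocessing as mp

def which_worker(_):
    # every pool worker is a separate OS process with a stable name
    return mp.current_process().name

if __name__ == '__main__':
    with mp.Pool(mp.cpu_count() - 1) as pool:
        names = pool.map(which_worker, range(1000))
    # 1000 tasks, but only cpu_count() - 1 distinct worker processes
    print(len(set(names)), 'distinct workers handled', len(names), 'tasks')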

The pool queues the apply_async() tasks; each one is handed to a worker process as soon as that worker finishes its previous task, so only the pool's workers ever run at the same time.
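If you want to watch the queue drain, here is a rough sketch (the task count and sleep time are arbitrary); AsyncResult.ready() tells you whether a given task has completed yet:

import time
import multiprocessing as mp

def slow(n):
    time.sleep(1)  # pretend this is the expensive f(args)
    return n

if __name__ == '__main__':
    with mp.Pool(2) as pool:  # only two workers
        pending = [pool.apply_async(slow, (n,)) for n in range(8)]
        while not all(r.ready() for r in pending):
            done = sum(r.ready() for r in pending)
            print(done, 'of 8 tasks finished')  # grows by roughly 2 per second
            time.sleep(1)
        print([r.get() for r in pending])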


Source: https://habr.com/ru/post/1539339/

