Python multiprocessing + subprocess issues

I have a binary (say a.out) that I want to run with different configurations. I want to run these configurations in parallel on a 40-core machine. Below is a sketch of my code.

It is very simple: I generate a configuration and pass it to a worker, and the worker calls the binary with that configuration using subprocess. I also redirect the output to a file. Let me call this piece of code run.py.

    import subprocess
    from multiprocessing import Pool

    def worker(cmdlist, filename):
        outputfile = open(filename, 'wb')
        # here it essentially executes: a.out config > outputfile
        subprocess.call(cmdlist, stderr=outputfile, stdout=outputfile)
        outputfile.close()

    def main():
        pool = Pool(processes=40)
        results = []
        for config in all_configs:
            filename, cmdlist = genCmd(config)
            res = pool.apply_async(worker, [cmdlist, filename])
            results.append(res)
        for res in results:
            res.get()
        pool.close()

But after I launched it, I realized that it was not spawning as many processes as I wanted. I definitely submitted more than 40 jobs, but in top I only see about 20 a.out processes.

I see many run.py processes in a sleeping state (i.e. "S" in top). When I run ps auf, I also see a lot of run.py processes in the "S+" state with no binary spawned. Only about half of them spawned a.out.

Why is this happening? I redirect the output to a network-mounted hard drive, which may be the reason, but in top I see only 10% wa (which, as I understand it, means 10% of the time is spent waiting for I/O). I do not think that explains 50% of the CPUs being idle. Also, I should at least see the binary spawned instead of being stuck in run.py. My binary's runtime is also quite long, so I really should see 40 jobs running for a long time.

Any other explanations? Anything I did wrong in my Python code?

1 answer

The approach I have used to run many simultaneous processes across several cores is to use p = subprocess.Popen(...) and p.poll(). In your case, I think you can skip using Pool altogether. I would give you a better example, but unfortunately I no longer have access to that code.
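For illustration only, here is a minimal sketch of what a Popen/poll loop could look like. It is not the code mentioned above, just one possible shape of the idea. The names run_all, max_parallel and poll_interval are made up for this example, and jobs is assumed to yield (filename, cmdlist) pairs like the ones genCmd returns in the question.

```python
import subprocess
import time


def run_all(jobs, max_parallel=40, poll_interval=0.1):
    """Run commands in parallel, at most max_parallel at a time.

    jobs: iterable of (filename, cmdlist) pairs (assumed shape, matching
    what genCmd returns in the question).
    Returns a list of (filename, returncode) in completion order.
    """
    pending = list(jobs)
    running = []    # (Popen object, open output file, filename)
    finished = []

    while pending or running:
        # Start new children while there is a free slot.
        while pending and len(running) < max_parallel:
            filename, cmdlist = pending.pop(0)
            out = open(filename, "wb")
            proc = subprocess.Popen(cmdlist, stdout=out, stderr=out)
            running.append((proc, out, filename))

        # poll() returns None while the child is still running,
        # otherwise it returns the exit code.
        still_running = []
        for proc, out, filename in running:
            if proc.poll() is None:
                still_running.append((proc, out, filename))
            else:
                out.close()
                finished.append((filename, proc.returncode))
        running = still_running

        # Avoid a busy loop while waiting for children to exit.
        if running:
            time.sleep(poll_interval)

    return finished
```

With the names from the question, it would be called roughly as run_all(genCmd(config) for config in all_configs). Since the parent process starts a.out directly, there are no intermediate run.py workers sitting between it and the binary.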


Source: https://habr.com/ru/post/1403796/

