I want to use subprocesses so that 20 instances of the written script are executed in parallel. Suppose I have a large list of URLs with 100,000 entries, and my program should control that all 20 instances of my script work on this list. I wanted to encode it like this:
urllist = [url1, url2, url3, .. , url100000] i=0 while number_of_subproccesses < 20 and i<100000: subprocess.Popen(['python', 'script.py', urllist[i]] i = i+1
My script just writes something to a database or text file. It does not output anything and does not need more input than the URL.
My problem is that I could not find something to get the number of active subprocesses. I am a beginner programmer, so every hint and suggestion is welcome. I am also wondering how I can control it when 20 subprocesses are loaded, what while loop checks conditions again? I was thinking that maybe I would put another loop over it, something like
while i<100000 while number_of_subproccesses < 20: subprocess.Popen(['python', 'script.py', urllist[i]] i = i+1 if number_of_subprocesses == 20: sleep()
Or maybe there is a chance that the while loop always checks the number of subprocesses?
I also considered the possibility of using the multiprocessing module, but it was very convenient for me to simply call script.py with a subprocess instead of a function with multiprocessing.
Maybe someone can help me and lead me in the right direction. Thanks Alo!
source share