Note: I only started with multiprocessing 2 days ago, so my understanding is very basic.
I am writing a download application for Amazon S3 buckets. If a file is large (100 MB), I download it in parallel using `Pool` from the `multiprocessing` module. My machine has a Core i7, and `cpu_count` reports 8.

My impression was that with `pool = Pool(processes=6)` I use 6 cores: the file is split into parts, and the downloads for the first 6 parts start at the same time. To see what happens when `processes` is larger than `cpu_count`, I entered 20 (meaning that I want to use 20 cores). To my surprise, instead of a block of errors, the program started downloading 20 parts at a time (I used a smaller chunk size to make sure there were plenty of parts).

I do not understand this behavior. I only have 8 cores, so how can the program accept 20 as input? When I say `processes=6`, does it actually use 6 threads? That would be the only explanation I can think of for 20 being valid input, since there can be 1000s of threads. Can someone please explain this to me?
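To make the question concrete, here is a minimal sketch that should reproduce the situation without S3. It is not the code I borrowed: `part_ranges`, `fake_download`, and `run` are hypothetical names, the download is simulated with a sleep, and the 5 MB chunk size is an assumption chosen so that a 100 MB file yields 20 parts.

```python
import os
import time
from multiprocessing import Pool, cpu_count


def part_ranges(file_size, chunk_size):
    """Split a file of file_size bytes into inclusive (start, end) byte ranges."""
    return [
        (start, min(start + chunk_size, file_size) - 1)
        for start in range(0, file_size, chunk_size)
    ]


def fake_download(byte_range):
    """Hypothetical stand-in for fetching one part from S3; it only sleeps."""
    time.sleep(0.5)  # simulates waiting on the network, which needs almost no CPU
    return os.getpid(), byte_range


def run(processes, file_size=100 * 1024 * 1024, chunk_size=5 * 1024 * 1024):
    """Fetch all parts with a pool of the given size; return the worker PIDs seen."""
    with Pool(processes=processes) as pool:
        results = pool.map(fake_download, part_ranges(file_size, chunk_size))
    return {pid for pid, _ in results}


if __name__ == "__main__":
    print("cpu_count:", cpu_count())
    # Ask for more workers than the machine has cores.
    pids = run(processes=20)
    print("distinct worker processes used:", len(pids))
```

According to the `multiprocessing` documentation, the `processes` argument is the number of worker processes to use, so nothing in this sketch refers to cores directly. The `if __name__ == "__main__":` guard is there so the script also works on platforms that start workers by re-importing the main module.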
Edit:
I "borrowed" the code from here. I changed it slightly so that it asks the user how many cores to use instead of setting `parallel_processes` to 4.