Since you are using Python 3.3, I recommend a stdlib solution that you won't find in the thread @njzk2 linked: concurrent.futures.
This is a higher level of abstraction than working directly with threading or multiprocessing primitives: you get an Executor interface that handles pooling and asynchronous reporting for you.
There is an example in the docs that is almost directly applicable to your situation, so I'll just post it here:
    import concurrent.futures
    import urllib.request

    URLS =
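A condensed sketch along the lines of the ThreadPoolExecutor example in the concurrent.futures docs (the URLs below are placeholders; substitute the requests you actually need to make):

    import concurrent.futures
    import urllib.request

    # Placeholder URLs -- replace with your own.
    URLS = ['http://www.example.com/',
            'http://www.example.org/',
            'http://www.example.net/']

    def load_url(url, timeout):
        # Fetch a single URL; this runs in a worker thread.
        return urllib.request.urlopen(url, timeout=timeout).read()

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # Kick off all requests and map each Future back to its URL.
        future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (url, exc))
            else:
                print('%r page is %d bytes' % (url, len(data)))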
You can replace the urllib.request calls with requests calls if you want; for obvious reasons, I like requests better.
The API works like this: you submit your calls to an executor and get back Future objects that represent their asynchronous execution. Then concurrent.futures.as_completed gives you an iterator over those Futures, yielding each one as it completes.
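For instance, here is a minimal sketch of that same pattern using requests instead of urllib.request (assuming requests is installed; get_page and the URL list are made-up names for illustration):

    import concurrent.futures
    import requests

    urls = ['http://www.example.com/', 'http://www.example.org/']  # placeholders

    def get_page(url):
        # requests raises on connection errors; raise_for_status covers HTTP errors too.
        response = requests.get(url, timeout=60)
        response.raise_for_status()
        return response.text

    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = {executor.submit(get_page, url): url for url in urls}
        for future in concurrent.futures.as_completed(futures):
            url = futures[future]
            try:
                print(url, len(future.result()))
            except Exception as exc:
                print(url, 'failed:', exc)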
Regarding your question:
In addition, is there a rule of thumb for determining the optimal number of threads depending on the number of requests?
No rule, no. It depends on too many things, including the speed of your internet connection. I will say that it doesn't really depend on the number of requests you have; it's more about the hardware you're running on.
Fortunately, it's pretty easy to tweak the max_workers kwarg and test for yourself. Start with 5 or 10 threads and increase in steps of 5. At some point you'll probably notice performance plateau and then start to decrease, as the overhead of adding extra threads exceeds the marginal gain of increased parallelization (which is a word).
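If you'd rather measure than guess, a rough sketch of that experiment (the URL list is a placeholder workload; fetch is a stand-in for whatever actually makes one of your requests):

    import time
    import concurrent.futures
    import urllib.request

    URLS = ['http://www.example.com/'] * 50   # placeholder workload

    def fetch(url):
        return urllib.request.urlopen(url, timeout=60).read()

    def time_pool(worker_count):
        # Time one full pass over the workload with a given pool size.
        start = time.time()
        with concurrent.futures.ThreadPoolExecutor(max_workers=worker_count) as executor:
            list(executor.map(fetch, URLS))
        return time.time() - start

    for n in (5, 10, 15, 20, 25):
        print('%2d workers: %.2f seconds' % (n, time_pool(n)))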