Here is my thread pool setup. On my machine, the maximum number of threads is 2047.
    from Queue import Queue  # Python 2.7; on Python 3 this is "from queue import Queue"
    from threading import Thread


    class Worker(Thread):
        """Thread executing tasks from a given tasks queue"""

        def __init__(self, tasks):
            Thread.__init__(self)
            self.tasks = tasks
            self.daemon = True
            self.start()

        def run(self):
            # Loop forever, pulling (func, args, kargs) tuples off the queue
            while True:
                func, args, kargs = self.tasks.get()
                try:
                    func(*args, **kargs)
                except Exception as e:
                    print(e)
                self.tasks.task_done()


    class ThreadPool:
        """Pool of threads consuming tasks from a queue"""

        def __init__(self, num_threads):
            self.tasks = Queue(num_threads)
            for _ in range(num_threads):
                Worker(self.tasks)

        def add_task(self, func, *args, **kargs):
            """Add a task to the queue"""
            self.tasks.put((func, args, kargs))

        def wait_completion(self):
            """Wait for completion of all the tasks in the queue"""
            self.tasks.join()
In other classes of my module, I instantiate the ThreadPool class above to create a new thread pool and then run my operations on it. Here is an example:
    def upload_images(self):
        '''batch uploads images to s3 via multi-threading'''
        num_threads = min(500, len(pictures))
        pool = ThreadPool(num_threads)

        for p in pictures:
            pool.add_task(p.get_set_upload_img)

        pool.wait_completion()
The problem I am facing is that these threads are not garbage collected.
After several runs, here is the error I get:

      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 495, in start
        _start_new_thread(self.__bootstrap, ())
    thread.error: can't start new thread

This means that I have hit the thread limit of 2047.
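To check that the threads really do pile up, here is a minimal sketch that counts live threads after each batch. It reuses the same Worker and ThreadPool as above; `run_batch` and the no-op task are hypothetical stand-ins for my real upload code, and the pool size of 5 is just an assumption to keep the example small.

```python
import threading
from threading import Thread

try:
    from queue import Queue   # Python 3
except ImportError:
    from Queue import Queue   # Python 2


class Worker(Thread):
    """Same worker as in the question: run() never returns."""

    def __init__(self, tasks):
        Thread.__init__(self)
        self.tasks = tasks
        self.daemon = True
        self.start()

    def run(self):
        while True:
            func, args, kargs = self.tasks.get()
            try:
                func(*args, **kargs)
            except Exception as e:
                print(e)
            self.tasks.task_done()


class ThreadPool:
    """Same pool as in the question."""

    def __init__(self, num_threads):
        self.tasks = Queue(num_threads)
        for _ in range(num_threads):
            Worker(self.tasks)

    def add_task(self, func, *args, **kargs):
        self.tasks.put((func, args, kargs))

    def wait_completion(self):
        self.tasks.join()


def run_batch(num_threads=5):
    """Hypothetical stand-in for upload_images: run no-op tasks,
    then report how many threads are still alive."""
    pool = ThreadPool(num_threads)
    for _ in range(num_threads):
        pool.add_task(lambda: None)
    pool.wait_completion()
    return threading.active_count()


# Create three pools in a row and record the live thread count after each.
counts = [run_batch() for _ in range(3)]
print(counts)
```

If the workers were being cleaned up, the count would stay flat. In this sketch it should instead grow by the pool size after every batch, since each worker is still blocked in `self.tasks.get()` once its pool has finished.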
Any ideas? Thanks.