Python threads are not being garbage collected

Here is my threading setup. On my machine, the maximum number of threads is 2047.

    from Queue import Queue
    from threading import Thread


    class Worker(Thread):
        """Thread executing tasks from a given tasks queue"""

        def __init__(self, tasks):
            Thread.__init__(self)
            self.tasks = tasks
            self.daemon = True
            self.start()

        def run(self):
            while True:
                # Blocks until a task is available; never returns on its own
                func, args, kargs = self.tasks.get()
                try:
                    func(*args, **kargs)
                except Exception as e:
                    print(e)
                self.tasks.task_done()


    class ThreadPool:
        """Pool of threads consuming tasks from a queue"""

        def __init__(self, num_threads):
            self.tasks = Queue(num_threads)
            for _ in range(num_threads):
                Worker(self.tasks)

        def add_task(self, func, *args, **kargs):
            """Add a task to the queue"""
            self.tasks.put((func, args, kargs))

        def wait_completion(self):
            """Wait for completion of all the tasks in the queue"""
            self.tasks.join()

In other classes of my module, I instantiate the ThreadPool class above to create a new thread pool and then run operations on it. Here is an example:

    def upload_images(self):
        '''batch uploads images to s3 via multi-threading'''
        num_threads = min(500, len(pictures))
        pool = ThreadPool(num_threads)
        for p in pictures:
            pool.add_task(p.get_set_upload_img)
        pool.wait_completion()

The problem I am facing is that these threads are not garbage collected.

After several runs, here is the error I get:

    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 495, in start
        _start_new_thread(self.__bootstrap, ())
    thread.error: can't start new thread

This means that I have hit the thread limit of 2047.

Any ideas? Thanks.

2 answers

Your worker thread never returns from run, so the thread never ends.

Maybe something like the following for your run method?

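    def run(self):
        while True:
            try:
                # A plain get() blocks forever and never raises Empty, so use
                # a timeout (1 second here is an arbitrary choice).
                func, args, kargs = self.tasks.get(timeout=1)
            except Empty:  # needs: from Queue import Empty
                break
            try:
                func(*args, **kargs)
            except Exception as e:
                print(e)
            self.tasks.task_done()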
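Another way to let run return is to send each worker an explicit stop signal once the work is done. The following is only a minimal sketch of that idea, assuming Python 3 (on Python 2.7 the module is named `Queue`). The names `_STOP`, `workers`, and `close()` are hypothetical additions that do not appear in the original code.

```python
import threading
from queue import Queue  # on Python 2.7: from Queue import Queue

_STOP = object()  # hypothetical sentinel: tells a worker to leave its loop


class Worker(threading.Thread):
    """Thread that runs tasks from a queue until it receives _STOP."""

    def __init__(self, tasks):
        threading.Thread.__init__(self)
        self.tasks = tasks
        self.daemon = True
        self.start()

    def run(self):
        while True:
            item = self.tasks.get()
            if item is _STOP:
                self.tasks.task_done()
                break  # returning from run() ends the thread
            func, args, kargs = item
            try:
                func(*args, **kargs)
            except Exception as e:
                print(e)
            finally:
                self.tasks.task_done()


class ThreadPool:
    """Pool of threads whose workers can be shut down explicitly."""

    def __init__(self, num_threads):
        self.tasks = Queue(num_threads)
        # Keep references to the workers so close() can join them.
        self.workers = [Worker(self.tasks) for _ in range(num_threads)]

    def add_task(self, func, *args, **kargs):
        self.tasks.put((func, args, kargs))

    def wait_completion(self):
        self.tasks.join()

    def close(self):
        """Hypothetical addition: stop every worker and wait for it to exit."""
        for _ in self.workers:
            self.tasks.put(_STOP)  # one sentinel per worker
        for worker in self.workers:
            worker.join()
```

Under these assumptions, a method such as upload_images could call `pool.close()` after `pool.wait_completion()`, so that each batch releases its threads before the next one starts.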
This looks like an infinite loop; maybe that is the reason? All the threads stay alive, so they cannot be garbage collected:

    def run(self):
        while True:
            func, args, kargs = self.tasks.get()
            try:
                func(*args, **kargs)
            except Exception as e:
                print(e)
            self.tasks.task_done()
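A rough way to observe this is to count live threads before and after each batch. The sketch below assumes Python 3 (on Python 2.7 the module is named `Queue`) and reuses the worker design from the question; `run_batch` is a hypothetical stand-in for a method like upload_images, with a no-op task in place of the real upload.

```python
import threading
from queue import Queue  # on Python 2.7: from Queue import Queue


class Worker(threading.Thread):
    """Same design as in the question: run() loops forever."""

    def __init__(self, tasks):
        threading.Thread.__init__(self)
        self.tasks = tasks
        self.daemon = True
        self.start()

    def run(self):
        while True:
            # Once the queue is empty, the thread blocks here indefinitely.
            func, args, kargs = self.tasks.get()
            try:
                func(*args, **kargs)
            except Exception as e:
                print(e)
            self.tasks.task_done()


def run_batch(num_threads):
    """Hypothetical stand-in for one upload_images() call."""
    tasks = Queue(num_threads)
    for _ in range(num_threads):
        Worker(tasks)
    for _ in range(num_threads):
        tasks.put((lambda: None, (), {}))  # no-op task
    tasks.join()
    # The queue and workers go out of scope here, but the threads keep running.


before = threading.active_count()
run_batch(5)
after_one = threading.active_count()
run_batch(5)
after_two = threading.active_count()
print(after_one - before, after_two - before)
```

Each batch leaves its workers blocked in `get()`, so the live-thread count grows by `num_threads` per call until the operating system refuses to start more.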


Source: https://habr.com/ru/post/1500480/

