I have a string-processing job in Python, and I want to speed it up using a thread pool. Processing each string is independent of the others, and the results are saved to a MongoDB database.
I wrote my code as follows:
import multiprocessing
from multiprocessing.pool import ThreadPool

def _process(s):
    # process the string and save the result to MongoDB
    ...

thread_pool_size = multiprocessing.cpu_count()
pool = ThreadPool(thread_pool_size)
for single_string in string_list:
    pool.apply_async(_process, [single_string])
pool.close()
pool.join()
I am running the code on a Linux machine with 8 processor cores, and it turns out that CPU usage peaks at only about 130% (as reported by top), while the job takes a few minutes to complete.
Is my approach to using the thread pool correct? Is there a better way to do this?
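For comparison, I have also thought about switching from threads to processes, since the string processing is CPU-bound. Below is a minimal sketch of what I mean, assuming _process is a top-level (picklable) function and the MongoDB write happens inside it; the input list and the body of _process here are placeholders.

import multiprocessing

def _process(s):
    # CPU-bound string processing; the result would be saved to MongoDB here
    return len(s)  # placeholder work

if __name__ == "__main__":
    string_list = ["alpha", "beta", "gamma"]  # placeholder input
    # one worker process per core; map() blocks until all strings are processed
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        pool.map(_process, string_list)

Would a process pool like this be the better choice here, or is the bottleneck likely somewhere else?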