Python ThreadPool from multiprocessing.pool cannot use all processors

I have a string-processing job in Python and want to speed it up with a thread pool. The strings are processed independently of each other, and the results are saved to a MongoDB database.

I wrote my code as follows:

import multiprocessing
from multiprocessing.pool import ThreadPool


def _process(s):
    # Do stuff, pure Python string manipulation.
    # Save the output to a database (PyMongo).
    pass


thread_pool_size = multiprocessing.cpu_count()
pool = ThreadPool(thread_pool_size)
# string_list holds the input strings.
for single_string in string_list:
    pool.apply_async(_process, [single_string])
pool.close()
pool.join()

I run the code on a Linux machine with 8 CPU cores. The maximum CPU usage is only about 130% (as reported by top) while the job runs for a few minutes.

Is my approach to using a thread pool correct? Is there a better way to do this?

+4
2 answers

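Perhaps _process is not CPU-bound; it may be spending its time waiting on the database, that is, on I/O. To check whether CPU usage rises, you could temporarily make it CPU-bound, for example: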

def _process(s):
    # CPU-bound busy loop; range replaces Python 2's xrange.
    for i in range(100000000):
        j = i * i
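
One way to tell whether a function is CPU-bound or mostly waiting is to compare the CPU time it consumes with the wall-clock time it takes. The following is only a sketch: `cpu_fraction`, `busy_loop`, and `fake_io` are hypothetical names, and `fake_io` merely sleeps as a stand-in for a blocking database call.

```python
import time


def cpu_fraction(func, *args):
    """Return CPU time divided by wall-clock time for one call of func."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0


def busy_loop(n):
    # Pure-Python arithmetic, so the time goes to the CPU.
    total = 0
    for i in range(n):
        total += i * i
    return total


def fake_io(seconds):
    # Hypothetical stand-in for a blocking database or network call.
    time.sleep(seconds)


if __name__ == "__main__":
    print("busy_loop:", round(cpu_fraction(busy_loop, 2_000_000), 2))
    print("fake_io:  ", round(cpu_fraction(fake_io, 0.2), 2))
```

A ratio near 1 suggests the call is CPU-bound, while a ratio near 0 suggests it spends most of its time waiting. Timing the real _process this way on a handful of strings should show which case applies.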
+2

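Threads will not help here. In the standard Python interpreter (CPython), the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so pure-Python string manipulation stays close to a single core. Use a process pool instead.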
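
Because CPython's global interpreter lock allows only one thread at a time to run Python bytecode, CPU-bound pure-Python work is usually spread across processes rather than threads. Here is a minimal sketch using `multiprocessing.Pool`. The names `process_string` and `run_in_processes` are hypothetical, the string transformation is a placeholder for the real work, and the database write is left out.

```python
import multiprocessing


def process_string(s):
    # Hypothetical placeholder for the pure-Python string manipulation.
    return s.strip().upper()[::-1]


def run_in_processes(strings, workers=None):
    # Each worker is a separate interpreter process with its own GIL.
    # workers=None lets Pool default to the number of CPUs.
    with multiprocessing.Pool(processes=workers) as pool:
        # chunksize batches items to reduce inter-process overhead.
        return pool.map(process_string, strings, chunksize=64)


if __name__ == "__main__":
    sample = [" abc ", "hello", "pool"]
    print(run_in_processes(sample, workers=2))
```

The worker function has to be defined at module level so it can be pickled and sent to the worker processes. If the database writes move into the workers, each process would typically need its own PyMongo client created inside that process, since PyMongo clients are not fork-safe.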

+1

Source: https://habr.com/ru/post/1585021/

