I have a string-processing job in Python, and I want to speed it up using a thread pool. Processing each string is independent of the others, and the results are saved to a MongoDB database.
I wrote my code as follows:
import multiprocessing
from multiprocessing.pool import ThreadPool

def _process(s):
    # process the string and save the result to MongoDB
    ...

thread_pool_size = multiprocessing.cpu_count()
pool = ThreadPool(thread_pool_size)
for single_string in string_list:
    pool.apply_async(_process, [single_string])
pool.close()
pool.join()
I am running the code on a Linux machine with 8 processor cores, and it turns out that CPU usage peaks at only about 130% (as reported by top), while the job takes a few minutes to complete.
Is my approach to using the thread pool correct? Is there a better way to do this?
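For comparison, I have also thought about switching from threads to processes, since the string processing is CPU-bound. Below is a minimal sketch of what I mean, assuming _process is a top-level (picklable) function and the MongoDB write happens inside it; the input list and the body of _process here are placeholders.

import multiprocessing

def _process(s):
    # CPU-bound string processing; the result would be saved to MongoDB here
    return len(s)  # placeholder work

if __name__ == "__main__":
    string_list = ["alpha", "beta", "gamma"]  # placeholder input
    # one worker process per core; map() blocks until all strings are processed
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        pool.map(_process, string_list)

Would a process pool like this be the better choice here, or is the bottleneck likely somewhere else?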