How can I access a shared dictionary with multiprocessing?

I think I am following the Python documentation correctly, but it's hard for me to get the result I'm looking for. I basically have a list of numbers that are passed to a function containing nested loops, and the output is stored in a dictionary.

Here is the code:

    from multiprocessing import Pool, Manager

    list = [1,2,3,10]
    dictionary = {}

    def test(x, dictionary):
        for xx in range(100):
            for xxx in range(100):
                dictionary[x] = xx*xxx

    if __name__ == '__main__':
        pool = Pool(processes=4)
        mgr = Manager()
        d = mgr.dict()
        for N in list:
            pool.apply_async(test, (N, d))
        # Mark pool as closed -- no more tasks can be added.
        pool.close()
        # Wait for tasks to exit
        pool.join()
        # Output results
        print d

Here's the expected result:

 {1: 9801, 2: 9801, 3: 9801, 10: 9801} 

Any suggestions on what I'm doing wrong? Also, I have not convinced myself that sharing resources is the better approach (I'm considering using a database to maintain state), so if my approach is completely off, or if there is a better way to do this in Python, please let me know.

1 answer

Change the definition of test to:

    def test(x, d):
        for xx in range(100):
            for xxx in range(100):
                d[x] = xx*xxx

Otherwise each worker just updates some global dictionary in its own process (with no synchronization), and you will never be able to read those results from the parent. The key point is to write into the managed dictionary that is passed in as an argument.


As for the general approach: this example in particular puts a lot of contention on the shared dictionary. Do you really need to update it from every process the moment each value is produced? Accumulating batches of partial results locally in each process and updating the shared object only once in a while should perform better.


Source: https://habr.com/ru/post/908303/
