Deadlock in concurrent.futures code

I've been trying to parallelise some code with concurrent.futures.ProcessPoolExecutor, but I keep getting strange deadlocks that don't occur with ThreadPoolExecutor. A minimal example:

    from concurrent import futures

    def test():
        pass

    with futures.ProcessPoolExecutor(4) as executor:
        for i in range(100):
            print('submitting {}'.format(i))
            executor.submit(test)

In Python 3.2.2 (on 64-bit Ubuntu) this seems to hang consistently after all the jobs have been submitted, and it seems to happen whenever the number of jobs is greater than the number of workers. If I replace ProcessPoolExecutor with ThreadPoolExecutor, it works flawlessly.

As an attempt to investigate, I gave each future a callback that prints the value of i:

    from concurrent import futures

    def test():
        pass

    with futures.ProcessPoolExecutor(4) as executor:
        for i in range(100):
            print('submitting {}'.format(i))
            future = executor.submit(test)

            def callback(f):
                print('callback {}'.format(i))

            future.add_done_callback(callback)

This confused me even more: the value of i printed by the callback is its value at the time the callback is called, not at the time it was defined (so I never see callback 0, but I get lots of callback 99). Again, ThreadPoolExecutor prints the expected values.

Wondering whether this could be a bug, I tried the latest dev version of Python. Now the code at least terminates, but I still get the wrong values of i printed.

So can someone explain:

  • what happened to ProcessPoolExecutor between Python 3.2 and the current dev version that apparently fixed this deadlock

  • why the "wrong" values of i are printed

EDIT: as Yukevich pointed out, printing i will of course show its value at the time the callback runs, not when it was defined; I don't know what I was thinking ... If I instead pass a callable object that stores the value of i as one of its attributes, it works as expected (see the sketch below).
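
For example, something along these lines (just a sketch of what I mean; the Callback class name is purely illustrative):

    from concurrent import futures

    def test():
        pass

    class Callback:
        """Callable that remembers the value of i it was created with."""
        def __init__(self, i):
            self.i = i

        def __call__(self, future):
            # self.i was fixed at submission time, so the right value is printed
            print('callback {}'.format(self.i))

    with futures.ProcessPoolExecutor(4) as executor:
        for i in range(100):
            print('submitting {}'.format(i))
            future = executor.submit(test)
            future.add_done_callback(Callback(i))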

EDIT: a bit more information: all the callbacks are executed, so it looks like it is executor.shutdown (called by executor.__exit__) that cannot tell that the processes are done. This seems to be completely fixed in the current Python 3.3, but there appear to have been a lot of changes to multiprocessing and concurrent.futures, so I don't know exactly what fixed it. Since I cannot use 3.3 (it doesn't seem to be compatible with either the release or the dev version of numpy), I tried simply copying its multiprocessing and concurrent packages into my 3.2 installation, which seems to work fine. Still, it seems a little strange that, as far as I can tell, ProcessPoolExecutor is completely broken in the latest release version, yet nobody else seems to be affected.
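
For reference, as far as I understand it the with block is roughly equivalent to calling shutdown(wait=True) in a finally clause, so that is the call that blocks:

    from concurrent import futures

    def test():
        pass

    # Roughly what the with statement does: __exit__ calls shutdown(wait=True),
    # which blocks until all pending work items are done and the workers exit.
    executor = futures.ProcessPoolExecutor(4)
    try:
        for i in range(100):
            executor.submit(test)
    finally:
        executor.shutdown(wait=True)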

1 answer

I modified the code as follows, and it solves both problems. The callback function was defined as a closure, so it used the updated value of i each time it ran. As for the deadlock, shutting down the Executor while tasks are still pending is a likely cause; waiting for the futures to complete solves that, too.

    from concurrent import futures

    def test(i):
        return i

    def callback(f):
        print('callback {}'.format(f.result()))

    with futures.ProcessPoolExecutor(4) as executor:
        fs = []
        for i in range(100):
            print('submitting {}'.format(i))
            future = executor.submit(test, i)
            future.add_done_callback(callback)
            fs.append(future)

        # block until every submitted task has finished before shutdown runs
        for _ in futures.as_completed(fs):
            pass
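
For comparison, a sketch of the same idea using executor.map, which passes each i to the worker explicitly and waits for the results as you iterate over them, so no callbacks are needed:

    from concurrent import futures

    def test(i):
        return i

    with futures.ProcessPoolExecutor(4) as executor:
        # map yields results in submission order and only returns each one
        # once the corresponding task has finished
        for result in executor.map(test, range(100)):
            print('done {}'.format(result))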

UPDATE: Oh, sorry, I hadn't read your updates; it seems these issues have already been resolved.


Source: https://habr.com/ru/post/908052/

