Is there a race condition in the latch sample in N3600?

Proposed for inclusion in C++14 (aka C++1y) are some new thread synchronization primitives: latches and barriers. The proposal is N3600.

It looks like a good idea, and the samples make it look very programmer-friendly. Unfortunately, I think the sample code invokes undefined behavior. The proposal says of latch::~latch():

Destroys the latch. If the latch is destroyed while other threads are in wait(), or are invoking count_down(), the behavior is undefined.

Note that it says "in wait()" and not "blocked in wait()", which is the wording used in the description of count_down().

Then the following example is provided:

The following is an example of the second use case. We need to load data and then process it using a number of threads. Loading the data is I/O bound, whereas starting threads and creating data structures is CPU bound. By running these in parallel, throughput can be increased.

    void DoWork() {
      latch start_latch(1);
      vector<thread*> workers;
      for (int i = 0; i < NTHREADS; ++i) {
        workers.push_back(new thread([&] {
          // Initialize data structures. This is CPU bound.
          ...
          start_latch.wait();
          // perform work
          ...
        }));
      }
      // Load input data. This is I/O bound.
      ...
      // Threads can now start processing
      start_latch.count_down();
    }

Isn't there a race between the threads waking up and returning from wait(), and the latch being destroyed when it leaves scope? In addition, all the thread objects are leaked. If the scheduler has not run every worker thread before count_down() returns and start_latch leaves scope, then I think undefined behavior results. Presumably, the fix is to iterate over the vector and join() and delete all the worker threads after count_down(), but before returning.
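A minimal sketch of that fix, under stated assumptions: SimpleLatch is a hypothetical stand-in built on a mutex and a condition variable, not the proposal's implementation, and NTHREADS = 4 and the done counter are placeholders for illustration. The threads are held by value in the vector rather than allocated with new, so only join() is needed and nothing is leaked.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the proposed latch (not the N3600 implementation).
class SimpleLatch {
public:
    explicit SimpleLatch(int count) : count_(count) {}

    // Decrement the counter and wake all waiters once it reaches zero.
    void count_down() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (--count_ == 0) {
            cv_.notify_all();
        }
    }

    // Block until the counter reaches zero. A woken thread still touches
    // mutex_ and cv_ on its way out, so the latch must outlive every waiter.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ <= 0; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};

const int NTHREADS = 4;  // assumed value; the sample leaves it unspecified

// Returns the number of workers that got past the latch.
int DoWork() {
    SimpleLatch start_latch(1);
    std::mutex done_mutex;
    int done = 0;
    std::vector<std::thread> workers;  // owned by value, so nothing leaks
    workers.reserve(NTHREADS);

    for (int i = 0; i < NTHREADS; ++i) {
        workers.emplace_back([&] {
            // Initializing data structures (CPU bound) would go here.
            start_latch.wait();
            // Perform work; here we only record that the worker ran.
            std::lock_guard<std::mutex> lock(done_mutex);
            ++done;
        });
    }

    // Loading input data (I/O bound) would go here.

    // Threads can now start processing.
    start_latch.count_down();

    // Join before start_latch leaves scope, so no thread can still be in
    // (or about to enter) wait() when the latch is destroyed.
    for (std::thread& worker : workers) {
        worker.join();
    }
    assert(done == NTHREADS);
    return done;
}
```

Because every worker is joined before DoWork() returns, a thread that the scheduler starts late still finds a live latch when it finally calls wait().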

  • Is there a problem with the sample code?
  • Do you agree that a proposal should include a complete, correct example, even when the problem is extremely simple, so that reviewers can see what the usage experience will be like?

Note: It seems possible that one or more worker threads have not yet begun to wait, and would therefore call wait() on a destroyed latch.


Update: A new version of the proposal has since appeared, but the representative example is unchanged.

1 answer

Thanks for pointing this out. Yes, I think that the code sample (which, in its defense, was intended to be brief) is broken. It should probably wait for the threads to finish.

Any implementation that lets threads block in wait() will almost certainly involve some kind of condition variable, and destroying the latch before the threads have exited wait() is potentially undefined behavior.

I do not know whether there is time to update the paper, but I can make sure that the next version is fixed.

Alasdair


Source: https://habr.com/ru/post/943729/

