What is a practical example where the storage order of the exhaust memory is different from sequential consistency?

Question

What is a practical example where the storage order of the exhaust memory is different from sequential consistency?

It is clear that sequential sequential atomic operations differ in their actual observed behavior from operations only with release-release in a valid C ++ program. Definitions are given in the C ++ standard (starting with C ++ 11) or here .

However, I have never come across a real-world example of an algorithm or data structure where the semantics of receive-release are insufficient and consistent consistency is needed.

What would be a practical example of a real-world algorithm or data structure where consistent consistency is required and memory ordering is not enough?

Note that even std::mutex does not guarantee consistent consistency .

+5

c ++ multithreading atomic c ++ 11 memory-model

Ralph tandetzky Jan 25 '17 at 18:04

source share

1 answer

Lwimsey · Accepted Answer · 2017-01-25T19:21:11+0000

Peterson's algorithm is an example of what requires consistent consistency.

In the days leading up to the mutexes, an algorithm was used to ensure access of the thread to the protected area. The algorithm works with only two streams, each of which controls a flag that expresses the intention to access the protected area. If both set the flag to (approximately) at the same time, both will be delayed and try again. The real algorithm is more advanced, since it uses the "turn" flag to control fair access, but in order to show the difference between seq / cst and acq / rel, this is not necessary.

The following is a ready-to-compile simplified version of the Peterson algorithm, which actually shows that the algorithms are broken if weaker sequential consistency is used. It is interesting to note that this is broken even on X86, as this platform allows reordering boot loads.
The problem with reordering the load in the store is that both streams can express their intention to access the protected area by setting the me flag to true , while both read false from the him flag (since the value was not applied to both streams yet) and enter protected area. This is not possible with consistent consistency.

With gcc I had to compile with -O3 optimization to have an assert fire, with clang , which was not needed. Both compilers use a different approach to implement consistent consistency.

 #include <thread> #include <atomic> #include <assert.h> std::atomic<bool> flag1{false}; std::atomic<bool> flag2{false}; std::atomic<int> counter{0}; // Change these to memory_order_seq_cst to fix the algorithm static const auto store_ordering = std::memory_order_release; static const auto load_ordering = std::memory_order_acquire; void busy(int n) { auto &me = (n==1) ? flag1 : flag2; auto &him = (n==1) ? flag2 : flag1; for (;;) { for (;;) { me.store(true, store_ordering); if (him.load(load_ordering) == false) { // got the 'lock' break; } // retention, no wait period -> busy loop me.store(false, store_ordering); } int tmp = counter.fetch_add(1, std::memory_order_relaxed); assert(tmp == 0); /* * critical area */ tmp = counter.fetch_sub(1, std::memory_order_relaxed); assert(tmp == 1); me.store(false, store_ordering); } } int main() { std::thread t1{busy, 1}; std::thread t2{busy, 2}; t1.join(); t2.join(); }

What is a practical example where the storage order of the exhaust memory is different from sequential consistency?

More articles: