Peterson's algorithm is an example of what requires consistent consistency.
In the days leading up to the mutexes, an algorithm was used to ensure access of the thread to the protected area. The algorithm works with only two streams, each of which controls a flag that expresses the intention to access the protected area. If both set the flag to (approximately) at the same time, both will be delayed and try again. The real algorithm is more advanced, since it uses the "turn" flag to control fair access, but in order to show the difference between seq / cst and acq / rel, this is not necessary.
The following is a ready-to-compile simplified version of the Peterson algorithm, which actually shows that the algorithms are broken if weaker sequential consistency is used. It is interesting to note that this is broken even on X86, as this platform allows reordering boot loads.
The problem with reordering the load in the store is that both streams can express their intention to access the protected area by setting the me flag to true , while both read false from the him flag (since the value was not applied to both streams yet) and enter protected area. This is not possible with consistent consistency.
With gcc I had to compile with -O3 optimization to have an assert fire, with clang , which was not needed. Both compilers use a different approach to implement consistent consistency.
#include <thread> #include <atomic> #include <assert.h> std::atomic<bool> flag1{false}; std::atomic<bool> flag2{false}; std::atomic<int> counter{0}; // Change these to memory_order_seq_cst to fix the algorithm static const auto store_ordering = std::memory_order_release; static const auto load_ordering = std::memory_order_acquire; void busy(int n) { auto &me = (n==1) ? flag1 : flag2; auto &him = (n==1) ? flag2 : flag1; for (;;) { for (;;) { me.store(true, store_ordering); if (him.load(load_ordering) == false) { // got the 'lock' break; } // retention, no wait period -> busy loop me.store(false, store_ordering); } int tmp = counter.fetch_add(1, std::memory_order_relaxed); assert(tmp == 0); /* * critical area */ tmp = counter.fetch_sub(1, std::memory_order_relaxed); assert(tmp == 1); me.store(false, store_ordering); } } int main() { std::thread t1{busy, 1}; std::thread t2{busy, 2}; t1.join(); t2.join(); }
source share