I have multithreaded C++ code with the following structure:
do_thread_specific_work();
update_shared_variables();
The work that follows checkpoint B could begin as soon as all threads have reached checkpoint A, hence my notion of a “soft barrier”.
Multithreading libraries generally provide only “hard barriers”, in which all threads must reach a given point before any of them can continue. Obviously, a hard barrier could be used at checkpoint B.
Using a soft barrier could lead to better execution times, especially since the work between checkpoints A and B may not be load-balanced across threads (i.e., one slow thread that has reached checkpoint A but not B could cause all the others to wait at a barrier just before checkpoint B).
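To make the intended control flow concrete, here is a minimal sketch of the soft-barrier pattern. It is only an illustration under stated assumptions: it uses C++11 `std::thread` and `std::atomic` with the default sequentially consistent ordering rather than OpenMP, and `SoftBarrier` and `run_soft_barrier_demo` are hypothetical names, not part of any library.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical single-use "soft barrier": arriving never blocks, waiting does.
class SoftBarrier {
public:
    explicit SoftBarrier(int num_threads) : remaining_(num_threads) {}

    // Checkpoint A: announce arrival and keep going.
    void arrive() { remaining_.fetch_sub(1); }

    // Checkpoint B: block until every thread has passed checkpoint A.
    void wait() const {
        while (remaining_.load() > 0) {
            std::this_thread::yield();
        }
    }

private:
    std::atomic<int> remaining_;
};

// Runs the pattern on num_threads threads and returns, per thread,
// the sum of all shared values that the thread read after checkpoint B.
std::vector<long> run_soft_barrier_demo(int num_threads) {
    std::vector<long> shared(num_threads, 0);
    std::vector<long> results(num_threads, 0);
    SoftBarrier barrier(num_threads);
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            shared[t] = t + 1;   // stands in for update_shared_variables()
            barrier.arrive();    // checkpoint A

            // Deliberately unbalanced private work between A and B.
            volatile long spin = 0;
            for (long i = 0; i < 10000L * t; ++i) {
                spin = spin + i;
            }

            barrier.wait();      // checkpoint B

            long sum = 0;        // reads every thread's shared value
            for (long v : shared) {
                sum += v;
            }
            results[t] = sum;
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return results;
}
```

Calling `run_soft_barrier_demo(8)` should give every thread the same sum, because no thread reads `shared` until all eight have passed checkpoint A. A fast thread is held at checkpoint B only until the slowest thread has reached A, not B.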
I tried using atomics to synchronize things, and I know with 100% certainty that this is NOT guaranteed to work. For example, using OpenMP syntax, before starting the parallel section:
shared_thread_counter = num_threads; //known at compile time
Then at checkpoint A:
#pragma omp atomic
shared_thread_counter--;
Then at checkpoint B (using polling):
#pragma omp flush
while (shared_thread_counter > 0) {
    usleep(1);
}
I have devised several experiments in which I use an atomic to indicate that some operation preceding it has completed. The experiments work with 2 threads most of the time, but fail consistently when I have many threads (e.g., 20 or 30). I suspect this is due to the caching structure of modern CPUs. Even if one thread updates some other value before performing the atomic decrement, another thread is not guaranteed to observe the two updates in that order. Consider the case where the other value is a cache miss and the atomic decrement is a cache hit.
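The ordering concern above can be sketched in C++11 terms. This is an illustration only, assuming `std::atomic` with explicit release/acquire ordering instead of OpenMP atomics; it is not the code from the experiments, and `publish_and_observe` is a hypothetical name.

```cpp
#include <atomic>
#include <thread>

// One producer writes ordinary data and then decrements an atomic counter;
// one consumer polls the counter and then reads the data.
// Returns the value the consumer observed.
int publish_and_observe() {
    int payload = 0;              // plain, non-atomic data (the "other value")
    std::atomic<int> counter{1};  // one thread still has to arrive
    int observed = -1;

    std::thread producer([&] {
        payload = 42;  // written before the decrement
        // Release: writes made before this point become visible to any
        // thread whose acquire load reads the decremented value.
        counter.fetch_sub(1, std::memory_order_release);
    });

    std::thread consumer([&] {
        // Acquire: pairs with the release decrement above.
        while (counter.load(std::memory_order_acquire) > 0) {
            std::this_thread::yield();
        }
        observed = payload;  // ordered after the producer's write
    });

    producer.join();
    consumer.join();
    return observed;
}
```

With `std::memory_order_relaxed` on both sides, nothing would order the write to `payload` before the consumer's read, and the read would be a data race. That matches the failure mode suspected above: the counter update becomes visible before the data it was supposed to guard.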
So, back to my question: how can I properly implement this "soft barrier"? Is there a built-in feature that guarantees such functionality? I would prefer OpenMP, but I am familiar with most of the other common multithreading libraries.
As a workaround, I am currently using a hard barrier at checkpoint B, and I restructured my code so that the work between checkpoints A and B load-balances automatically across threads (which was quite difficult at times).
Thanks for any advice/insight :)