A few things you should do, first of all:
Take the read and write positions modulo the buffer size when you access the buffer, but keep _lastReadIndex and _lastWrittenIndex themselves unwrapped. That way you can tell how much data is available and how much was lost, or block the writer if it would overrun the reader after a full cycle.
And, very importantly, avoid sharing as much as possible: put the reader's variables and the writer's variables on separate cache lines.
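As a rough sketch of both points, the layout and the index arithmetic might look like the following. The 64-byte line size, the BUF_SIZE value, the struct name and the helper names are assumptions made for illustration; only _lastReadIndex and _lastWrittenIndex come from the question.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumption: common x86-64 cache line size */
#define BUF_SIZE   8  /* assumption: illustrative capacity */

/* Hypothetical layout: each index gets its own cache line so that the
 * writer updating its index does not invalidate the reader's line. */
struct ring {
    alignas(CACHE_LINE) _Atomic uint64_t _lastWrittenIndex; /* stored by the writer only */
    alignas(CACHE_LINE) _Atomic uint64_t _lastReadIndex;    /* stored by the reader only */
    alignas(CACHE_LINE) int buffer[BUF_SIZE];
};

static_assert(offsetof(struct ring, _lastReadIndex) -
              offsetof(struct ring, _lastWrittenIndex) >= CACHE_LINE,
              "indices must not share a cache line");

/* The indices are never wrapped; only the slot lookup uses the modulo. */
static size_t ring_slot(uint64_t index)
{
    return (size_t)(index % BUF_SIZE);
}

/* Entries written but not yet read (exceeds BUF_SIZE once the writer has lapped the reader). */
static uint64_t ring_pending(uint64_t written, uint64_t read)
{
    return written - read;
}

/* Entries that were overwritten before the reader got to them. */
static uint64_t ring_lost(uint64_t written, uint64_t read)
{
    uint64_t pending = written - read;
    return pending > BUF_SIZE ? pending - BUF_SIZE : 0;
}
```

Because the indices only ever grow, a single subtraction tells you both how far behind the reader is and whether anything has been overwritten. With 64-bit counters, wraparound of the index itself is not a practical concern.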
Now, to your question:
If you are trying to be portable, the memory ordering your code needs should not depend on the architecture; the standard atomic functions take care of that. You only need to make sure that the data is in the buffer before the write index is incremented, which means release semantics on the increment. You must also make sure that the writer actually stores the data to memory, rather than the compiler optimizing it to stay in registers only.
    newIndex = _lastWrittenIndex + 1;
    buffer[newIndex % bufSize] = newData;
    atomic_store_explicit(&_lastWrittenIndex, newIndex, memory_order_release);
On x86/x86-64, this is equivalent to:
    newIndex = _lastWrittenIndex + 1;
    buffer[newIndex % bufSize] = newData;
    // Release semantics means a reordering barrier before the store:
    barrier(); // translates to `asm volatile("":::"memory");`
    *(volatile int*)&_lastWrittenIndex = newIndex;
If your code accesses _lastWrittenIndex no more than is absolutely necessary, as above, you may as well declare it volatile, but keep in mind that the barrier is still needed!
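For completeness, here is a minimal sketch of how the writer's release store could pair with an acquire load on the reader side. It assumes exactly one writer thread and one reader thread, and it does not handle the writer lapping the reader. The function names and BUF_SIZE are hypothetical, and the index starts at 0, meaning "nothing written yet".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_SIZE 8 /* assumption: illustrative capacity */

static int buffer[BUF_SIZE];
static _Atomic uint64_t _lastWrittenIndex; /* shared; stored only by the writer */
static uint64_t _lastReadIndex;            /* private to the reader in this sketch */

/* Writer: fill the slot first, then publish the new index with release
 * semantics so the data store cannot be reordered after the index store. */
static void ring_write(int newData)
{
    uint64_t newIndex =
        atomic_load_explicit(&_lastWrittenIndex, memory_order_relaxed) + 1;
    buffer[newIndex % BUF_SIZE] = newData;
    atomic_store_explicit(&_lastWrittenIndex, newIndex, memory_order_release);
}

/* Reader: the acquire load pairs with the writer's release store, so once
 * the new index is visible the slot contents are visible as well.
 * Returns false when there is nothing new to read. */
static bool ring_read(int *out)
{
    assert(out != NULL);
    uint64_t written =
        atomic_load_explicit(&_lastWrittenIndex, memory_order_acquire);
    if (written == _lastReadIndex)
        return false;
    uint64_t next = _lastReadIndex + 1;
    *out = buffer[next % BUF_SIZE];
    _lastReadIndex = next;
    return true;
}
```

If the writer can get more than BUF_SIZE entries ahead, the reader would also have to compare the two indices before and after copying a slot, or the writer would have to wait; that part is left out here.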