The problem with a volatile write on x86 is that it issues a full memory barrier, which stalls the writing thread until the store buffer has drained. A lazySet on x86, by contrast, is a plain store. It does not require the earlier stores still waiting in the store buffer to be flushed first, so the writing thread can proceed at full speed.
This is touched on in an article by Martin Thompson.
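
To make the difference concrete, here is a minimal sketch using `AtomicLong`. The class name `LazySetSketch` and the single-writer sequence counter are made up for illustration; the machine code described in the comments is what a JIT typically emits on x86, not something the Java API guarantees.

```java
import java.util.concurrent.atomic.AtomicLong;

public class LazySetSketch {
    // Hypothetical single-writer sequence counter, for illustration only.
    private final AtomicLong sequence = new AtomicLong(0);

    // Volatile write. On x86 this is typically compiled to a store followed
    // by a full fence, so the writer waits for the store buffer to drain.
    public void publishVolatile(long value) {
        sequence.set(value);
    }

    // Ordered ("lazy") write. Earlier stores cannot be reordered after it,
    // but no full fence is required; on x86 it is typically a plain store.
    public void publishLazy(long value) {
        sequence.lazySet(value);
    }

    // Volatile read, used by consumers to observe the published value.
    public long read() {
        return sequence.get();
    }

    public static void main(String[] args) throws InterruptedException {
        LazySetSketch s = new LazySetSketch();

        // Single writer thread publishing with lazySet in a tight loop.
        Thread writer = new Thread(() -> {
            for (long i = 1; i <= 1_000_000; i++) {
                s.publishLazy(i);
            }
        });
        writer.start();
        writer.join(); // join() makes the writer's final store visible here

        System.out.println("last published value: " + s.read());
    }
}
```

The trade-off is visibility latency: with `lazySet`, other threads may see the new value slightly later than they would after a volatile write, which is usually acceptable when a single thread owns the variable and readers only need to see it eventually.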