This code could certainly exhibit false sharing if the compiler chose to implement it that way, but that would be a poor choice on the compiler's part.
In the first loop, each thread accesses only one element of sum. There is no reason to perform num_steps writes to the actual stack memory holding that element; it is much faster to keep the value in a register and write it back once, after the for loop completes. Since the array is neither volatile nor atomic, nothing stops the compiler from doing exactly that.
And, of course, in the second loop there are no writes to the array, so there is no false sharing.
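As a rough illustration of the register accumulation described for the first loop, here is a minimal sketch in plain C. The function names, the parameters, and the integrand 4/(1+x^2) are assumptions made for the example, not taken from the original code. The first function stores to the shared array on every iteration; the second does by hand what an optimizing compiler is free to do on its own, which is to keep the running total in a local variable and store it once.

```c
/* Hypothetical per-thread worker (names and integrand are assumptions).
   Written naively: one store to the shared array slot per iteration.
   If neighbouring slots share a cache line, these stores are what
   could cause false sharing between threads. */
void accumulate_direct(double *sum, int id, int nthreads, long num_steps)
{
    double step = 1.0 / (double)num_steps;
    sum[id] = 0.0;
    for (long i = id; i < num_steps; i += nthreads) {
        double x = (i + 0.5) * step;
        sum[id] += 4.0 / (1.0 + x * x);
    }
}

/* Same computation with an explicit local accumulator. The running
   total can live in a register, and the shared array is written
   exactly once, after the loop. Because sum is neither volatile nor
   atomic, a compiler may turn the version above into this one. */
void accumulate_local(double *sum, int id, int nthreads, long num_steps)
{
    double step = 1.0 / (double)num_steps;
    double local = 0.0;
    for (long i = id; i < num_steps; i += nthreads) {
        double x = (i + 0.5) * step;
        local += 4.0 / (1.0 + x * x);
    }
    sum[id] = local; /* single write to shared memory */
}
```

Calling either function once per thread id and then adding up the slots, multiplied by 1.0/num_steps, should give the same approximation of pi. Writing the local accumulator explicitly simply avoids relying on the optimizer to make that transformation.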