Intel 64 and IA-32 | Atomic operations, including acquire / release semantics

Question

According to the Intel 64 and IA-32 Architectures Software Developer's Manual, the LOCK signal prefix "ensures that the processor has exclusive use of any shared memory while the signal is asserted." This may take the form of a bus lock or a cache lock.

But, and this is the reason I am asking, it is not clear to me whether this prefix also provides any memory barrier.

I am developing with NASM in a multiprocessor environment and need to implement atomic operations with optional acquire and/or release semantics.

So, do I need to use the MFENCE, SFENCE and LFENCE instructions, or would that be redundant?

+5

assembly x86 locking intel memory-fences

0xbadf00d Jan 27 '11 at 6:14


2 answers

No. From the IA-32 manuals (Volume 3A, Section 8.2: Memory Ordering):

Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.

Therefore, a fence instruction is not needed with locked instructions.
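To make the rule above concrete, here is a minimal sketch in C11 rather than NASM, assuming a GCC or Clang toolchain with pthreads. All names (`spin_lock`, `spin_unlock`, `run_spinlock_demo`) are hypothetical. On x86 a compiler will typically emit an XCHG (implicitly locked) for the acquire step and a plain MOV for the release step, with no separate fence instruction; that mapping is an assumption about the compiler, not something the C standard promises.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names for illustration; not taken from the original post. */
static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static long shared_counter = 0;   /* plain variable, protected by the lock */

static void spin_lock(void) {
    /* Acquire: test-and-set is a locked read-modify-write (typically XCHG
       on x86), so later reads and writes cannot be reordered before it. */
    while (atomic_flag_test_and_set_explicit(&lock_flag, memory_order_acquire)) {
        /* spin until the current holder clears the flag */
    }
}

static void spin_unlock(void) {
    /* Release: earlier reads and writes stay before this store. */
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

static void *worker(void *arg) {
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        spin_lock();
        shared_counter++;          /* non-atomic increment inside the lock */
        spin_unlock();
    }
    return NULL;
}

/* Runs two threads and returns the final counter value. */
long run_spinlock_demo(long iterations) {
    pthread_t a, b;
    shared_counter = 0;
    pthread_create(&a, NULL, worker, &iterations);
    pthread_create(&b, NULL, worker, &iterations);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

If the lock provides the ordering described in the quote, no increment is lost and `run_spinlock_demo(n)` returns `2 * n`.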

+3

etherice May 31 '13 at 4:25


Gj. · Accepted Answer · 2011-01-27T10:08:35+0000

No, there is no need to use the MFENCE, SFENCE and LFENCE instructions in conjunction with the LOCK prefix.

The MFENCE, SFENCE and LFENCE instructions guarantee visibility of memory across all CPU cores. For example, the MOV instruction cannot be used with the LOCK prefix, so to be sure that the result of a memory move is visible to all CPU cores, we must be sure that the CPU cache is flushed to RAM, and that is what we achieve with the fence instructions.
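As a hedged illustration of the MOV case, the sketch below publishes a value with a plain atomic store, written in C11 with pthreads. The names (`producer`, `consumer`, `run_publish_demo`, `store_with_full_barrier`) are hypothetical. It assumes the usual x86 compiler mapping: a release store and an acquire load typically become ordinary MOVs, while a sequentially consistent store typically becomes an XCHG or a MOV followed by MFENCE. Only that last case, where a store must not be reordered with a later load, should need a fence or a locked instruction.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names for illustration; not taken from the original post. */
static int payload = 0;        /* ordinary data written before publishing */
static atomic_int ready = 0;   /* flag that publishes the payload */

static void *producer(void *arg) {
    payload = *(int *)arg;
    /* Release store: the payload write above cannot move below it. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *out) {
    /* Acquire load: the payload read below cannot move above it. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        /* spin until the producer publishes */
    }
    *(int *)out = payload;
    return NULL;
}

/* Store that also acts as a full barrier (store-then-load ordering). */
void store_with_full_barrier(atomic_int *target, int value) {
    atomic_store_explicit(target, value, memory_order_seq_cst);
}

/* Starts one consumer and one producer; returns what the consumer saw. */
int run_publish_demo(int value) {
    pthread_t p, c;
    int seen = -1;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);
    pthread_create(&c, NULL, consumer, &seen);
    pthread_create(&p, NULL, producer, &value);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

The consumer only reads `payload` after it has observed `ready == 1`, so it sees the value the producer wrote.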

EDIT: more about locked atomic operations, from the Intel manual:

LOCKED ATOMIC OPERATIONS
The 32-bit IA-32 processors support locked atomic operations on locations in system memory. These operations are typically used to manage shared data structures (such as semaphores, segment descriptors, system segments, or page tables) in which two or more processors may try simultaneously to modify the same field or flag. The processor uses three interdependent mechanisms for carrying out locked atomic operations:
• Guaranteed atomic operations
• Bus locking, using the LOCK# signal and the LOCK instruction prefix
• Cache coherency protocols that ensure that atomic operations can be carried out on cached data structures (cache lock); this mechanism is present in the Pentium 4, Intel Xeon, and P6 family processors
These mechanisms are interdependent in the following ways. Certain basic memory transactions (such as reading or writing a byte in system memory) are always guaranteed to be handled atomically. That is, once started, the processor guarantees that the operation will be completed before another processor or bus agent is allowed access to the memory location. The processor also supports bus locking for performing selected memory operations (such as a read-modify-write operation in a shared area of memory) that typically need to be handled atomically, but are not automatically handled this way. Because frequently used memory locations are often cached in a processor's L1 or L2 caches, atomic operations can often be carried out inside a processor's caches without asserting the bus lock. Here the processor's cache coherency protocols ensure that other processors that are caching the same memory locations are managed properly while atomic operations are performed on cached memory locations.
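The read-modify-write operations mentioned in the excerpt can be sketched in C11 as follows, again assuming GCC or Clang with pthreads. The names (`count_hits`, `run_hits_demo`, `atomic_store_max`) are hypothetical. On x86 a compiler will typically turn the fetch-add into a LOCK ADD or LOCK XADD and the compare-exchange into a LOCK CMPXCHG; whether the hardware then uses a bus lock or a cache lock is not visible at this level.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names for illustration; not taken from the original post. */
static atomic_long hits = 0;

static void *count_hits(void *arg) {
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        /* Atomic read-modify-write: no increment can be lost. */
        atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
    }
    return NULL;
}

/* Runs two threads that each add n; returns the final total. */
long run_hits_demo(long n) {
    pthread_t a, b;
    atomic_store_explicit(&hits, 0, memory_order_relaxed);
    pthread_create(&a, NULL, count_hits, &n);
    pthread_create(&b, NULL, count_hits, &n);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load_explicit(&hits, memory_order_relaxed);
}

/* Compare-and-swap loop: raises *target to v if v is larger.
   Returns the value that was last observed in *target before any update. */
long atomic_store_max(atomic_long *target, long v) {
    long seen = atomic_load_explicit(target, memory_order_relaxed);
    while (seen < v &&
           !atomic_compare_exchange_weak_explicit(target, &seen, v,
                                                  memory_order_acq_rel,
                                                  memory_order_relaxed)) {
        /* A failed exchange reloads 'seen'; the loop condition re-checks it. */
    }
    return seen;
}
```

A plain `hits++` on a non-atomic variable would compile to a separate load, add, and store, which is exactly the kind of operation the excerpt says is not handled atomically on its own.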
