Do atomic operations remain atomic on multiprocessor and multicore systems?

As per the title, plus what are the limitations and gotchas?

For example, on x86 processors, alignment for most data types is optional: an optimization, not a requirement. This means that a pointer can be stored at an unaligned address, which in turn means that the pointer can be split across a cache-line (or page) boundary.

Obviously, you could do this on any processor if you worked hard enough (picking out specific bytes, etc.), but not in a way where you would still expect the write operation to be atomic.

I seriously doubt that a multi-core processor can guarantee that other cores see a consistent all-before or all-after view of the written pointer in this situation, where an unaligned write crosses a page boundary.
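To make the concern concrete, here is a minimal C++ sketch of the arithmetic that decides whether a stored object crosses a boundary. The helper names and the 64-byte line size are assumptions for illustration (64 bytes is typical on current x86 parts, but it is not an architectural guarantee); pass 4096 as `line` to test a typical page boundary instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size, not something the architecture promises.
constexpr std::size_t kCacheLine = 64;

// True if `size` bytes starting at `addr` touch two different lines.
inline bool straddles_cache_line(std::uintptr_t addr, std::size_t size,
                                 std::size_t line = kCacheLine) {
    return size != 0 && (addr / line) != ((addr + size - 1) / line);
}

// True if `addr` is a multiple of `align` (which must be a power of two).
inline bool is_aligned(std::uintptr_t addr, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}
```

An 8-byte pointer stored at address 60 occupies bytes 60 to 67 and therefore spans two 64-byte lines, whereas the same pointer at address 56 or 64 stays within one.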

Am I right? And are there any similar gotchas that I haven't thought of?

+4
5 answers

The very concept of a single memory, visible to all threads, stops working once you have multiple cores with separate caches. StackOverflow questions about memory barriers may be of interest; say, this one.

I think the example that illustrates the problem with the "single memory" model is this: Initially, x = y = 0.

Thread 1:

 X = x; y = 1;

Thread 2:

 Y = y; x = 1;

Of course, there is a race condition. The second problem, beyond the obvious race, is that one of the possible results is X = 1, Y = 1. On weakly ordered hardware (ARM or POWER, for example), this can happen even without compiler optimizations, and even if you write the two threads in assembly.
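For anyone who wants to try it, here is a rough C++ sketch of the same test using relaxed atomics, so that the program itself has no undefined behavior. The function names are made up for illustration. The C++ memory model permits the X = 1, Y = 1 outcome with relaxed ordering, but whether you ever observe it depends on the compiler and the hardware: x86 does not reorder a load with a later store, so on x86 it could only come from compiler reordering, and a count of zero proves nothing.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// One run of the two-thread test above. Returns the pair (X, Y).
std::pair<int, int> run_once() {
    std::atomic<int> x{0}, y{0};
    int X = -1, Y = -1;

    std::thread t1([&] {
        X = x.load(std::memory_order_relaxed);   // X = x;
        y.store(1, std::memory_order_relaxed);   // y = 1;
    });
    std::thread t2([&] {
        Y = y.load(std::memory_order_relaxed);   // Y = y;
        x.store(1, std::memory_order_relaxed);   // x = 1;
    });
    t1.join();
    t2.join();
    return {X, Y};
}

// Repeats the test and counts how often X == 1 && Y == 1 shows up.
int count_both_one(int runs) {
    assert(runs >= 0);
    int hits = 0;
    for (int i = 0; i < runs; ++i) {
        std::pair<int, int> r = run_once();
        if (r.first == 1 && r.second == 1) ++hits;
    }
    return hits;
}
```

Starting fresh threads for every run makes the interesting interleaving rare, so treat this as a demonstration of the shape of the test rather than a reliable detector.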

+2

Maybe I misunderstand the example, but the problem of an unaligned pointer is the same as with single-core execution. If a pointer can be partially written to memory, then different threads can see partial updates (absent proper locking) on any machine with preemptive multitasking, even on a system with one processor.

You do not need to worry about the cache unless you are writing drivers for peripherals with DMA support. Modern multiprocessors are cache coherent, so the hardware ensures that a thread on processor A has the same view of memory as a thread on processor B. If the thread on A reads a memory location that is cached on B, then the thread on A gets the correct value from B's cache.

You do need to worry about values held in registers, where a change may not be visible to other threads, but in my opinion bringing the cache into a discussion of concurrency often just introduces unnecessary confusion.

Any operation designated "atomic" in the ISA programming guide must reasonably remain atomic in a multiprocessor system built with processors using that ISA, or backward compatibility would break. However, this does not mean that operations that were never promised to be atomic, but happen to be atomic in a specific processor implementation, will remain atomic in the future (for example, in a multiprocessor system).

[Edit] Conclusion from the comments below:

  • Everything written to memory will be coherently visible to all threads, regardless of the number of cores (in a cache-coherent system).
  • Anything written to memory non-atomically may be read partially by unsynchronized threads in the presence of preemption (even on a single-core system).

If a pointer is written to an unaligned address in one atomic write, then the cache-coherence hardware will ensure that every thread sees it either complete or not at all. If the pointer is written non-atomically (for example, with two separate write operations), then any thread can see a partial update, even on a single-core system with true preemption.
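As an illustration of the "complete or not at all" case, here is a small C++ sketch; the function name is hypothetical. One thread keeps flipping a `std::atomic<int*>` between two targets while another checks that every value it reads is one of the two. Note that `std::atomic<int*>` is naturally aligned by the implementation, so this shows the atomic-write case only and says nothing about what happens to a deliberately unaligned store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns true if the reader only ever observed one of the two whole
// pointer values while the writer was flipping between them.
bool reader_sees_only_whole_pointers(int iterations) {
    assert(iterations > 0);
    static int a = 1, b = 2;
    std::atomic<int*> shared{&a};
    std::atomic<bool> done{false};
    bool ok = true;

    std::thread writer([&] {
        for (int i = 0; i < iterations; ++i)
            shared.store((i & 1) ? &a : &b, std::memory_order_release);
        done.store(true, std::memory_order_release);
    });

    while (!done.load(std::memory_order_acquire)) {
        int* p = shared.load(std::memory_order_acquire);
        if (p != &a && p != &b) ok = false;   // a torn pointer would land here
    }
    writer.join();
    return ok;
}
```

Replacing the atomic with a plain `int*` would turn this into a data race, which is undefined behavior in C++ regardless of what the hardware happens to do.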

+1

On x86, the answer is yes: if the assembly instruction carries the LOCK prefix, the processor asserts a hardware signal that ensures the instruction that follows is atomic (on some processors, the caches coordinate instead to ensure the atomicity of the operation).

Emitting atomic instructions is not something compilers do on their own; on multiprocessor systems, atomic assembly operations are very expensive and are usually used to implement the locking primitives offered by the OS / C library.

No high-level language memory operations should be considered atomic. If you have multiple threads writing to the same place in shared memory, you need to use some kind of mutex / lock mechanism to avoid races.
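A short C++ sketch of both options, with made-up function names: an atomic read-modify-write and a mutex around a plain variable. On x86, mainstream compilers typically turn `fetch_add` into a LOCK-prefixed instruction, but that is an assumption about code generation, not something the language promises.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Several threads bump a shared counter with an atomic read-modify-write.
long count_with_atomic(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& th : pool) th.join();
    return counter.load();
}

// Same job with a plain long guarded by a mutex.
long count_with_mutex(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    long counter = 0;
    std::mutex m;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                std::lock_guard<std::mutex> guard(m);
                ++counter;
            }
        });
    for (auto& th : pool) th.join();
    return counter;
}
```

Both versions return `threads * per_thread`; a plain `++counter` with neither protection would be a data race and could lose updates.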

+1

Is it possible for this class to print "False"?

    using System;
    using System.Threading;

    class Unsafe
    {
        static bool underwearOn, trousersOn;

        static void Main()
        {
            new Thread(Wait).Start();   // Start up the busy waiter
            Thread.Sleep(1000);         // Give it a second to start up!
            underwearOn = true;
            trousersOn = true;
        }

        static void Wait()
        {
            while (!trousersOn) ;       // Spin until trousersOn
            Console.Write(underwearOn);
        }
    }

Yes, on multi-core machines. Value types such as bools can be held in machine registers, and the order in which registers are written back to memory is machine-dependent. underwearOn may be written back after trousersOn.

You can put locks around the assignments and the while loop, but this can hurt performance. A better solution is to declare the bool variables volatile. Such variables are not cached in registers.
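For comparison, here is roughly how the same pattern looks in C++, where the tool for this job is `std::atomic` rather than `volatile` (C++ `volatile` gives no inter-thread ordering). This is only a sketch with invented names: the flag is stored with release ordering and read with acquire ordering, so once the waiter sees the flag it is also guaranteed to see the earlier plain write.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// C++ analogue of the Unsafe class above. Returns what the waiter
// observed for underwear_on after it saw trousers_on become true.
bool waiter_sees_underwear() {
    bool underwear_on = false;               // plain data
    std::atomic<bool> trousers_on{false};    // publication flag
    bool observed = false;

    std::thread waiter([&] {
        while (!trousers_on.load(std::memory_order_acquire)) { }  // spin
        observed = underwear_on;   // ordered after the acquire load
    });

    underwear_on = true;
    trousers_on.store(true, std::memory_order_release);  // publishes the write above
    waiter.join();
    return observed;
}

// Repeats the experiment; with release/acquire the assert should never fire.
void check_many(int runs) {
    for (int i = 0; i < runs; ++i)
        assert(waiter_sees_underwear());
}
```

Only the flag needs to be atomic here; the release/acquire pair is what orders the ordinary write to `underwear_on` before the waiter's read.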

Edit:

This is a simplified example from the presentation available in Threading Complete.

0

Do atomic operations remain atomic on multiprocessor and multicore systems?

The concept of "indivisible" (or "atomic") has little meaning in a sequential (single-core, single-threaded) system. To find out whether something is atomic or not, you need an external observer, and that external observer can only be another thread, regardless of whether it is scheduled on the same core or on another core. Atomic means that no external observer can observe an intermediate state. Let me recommend the book "The Art of Multiprocessor Programming" for a deeper understanding of these concepts.

What you are probably asking is whether seemingly atomic operations (such as the one-line statement x = 3) are actually atomic. The answer is no, and there is a well-known example: the handling of doubles in Java. A double is stored as two 32-bit words, and the Java specification does not guarantee that reads and writes of non-volatile doubles are atomic (although they are on almost all major JVMs). Another thread may observe a state in which only one of the two words has been updated. Once again, it doesn't matter whether the two threads are scheduled on the same core or on different cores.

In any case, you should always rely on synchronization operations (such as reading or writing volatile variables, barriers, locks, etc.) when you want to observe shared data in a consistent state. Another way to avoid these problems is to avoid sharing altogether. This is possible with purely functional or message-passing languages.

0

Source: https://habr.com/ru/post/1303170/

