Relying on network I/O for cross-thread synchronization in C++

Can external I/O be used as a form of cross-thread synchronization?

To be specific, consider the pseudo-code below, which assumes the existence of network/socket functions:

  int a;          // Globally accessible data.
  socket s1, s2;  // Platform-specific.

  int main() {
      // Set up and connect two sockets to (the same) remote machine.
      s1 = ...;
      s2 = ...;
      std::thread t1{thread1}, t2{thread2};
      t1.join();
      t2.join();
  }

  void thread1() {
      a = 42;
      send(s1, "foo");
  }

  void thread2() {
      recv(s2);  // Blocking receive (error handling omitted).
      f(a);      // Use a; should be 42.
  }

We assume that the remote machine only sends data to s2 after it has received "foo" on s1 . If this assumption fails then, of course, undefined behavior will occur. But if it holds (and no other external failure occurs, such as corrupted network data), does this program have defined behavior?

"Never", "indefinite (implementation dependent)", "depends on the guarantees provided by the send / recv implementation" are examples of the answers I expect, preferably with justification from the C ++ standard (or other relevant standards such as POSIX for sockets / networks).

If "never", then changing a as std::atomic<int> , initialized with a certain value (for example, 0), should avoid undefined behavior, but then is this value guaranteed to be read as 42 in thread2 or can an outdated value be read? Do POSIX sockets provide an additional guarantee to ensure that an obsolete value will not be counted?

If "it depends": do POSIX sockets provide a guarantee suitable for making the behavior defined? (What about s1 and s2 being the same socket rather than two separate sockets?)

For reference, the standard I/O library contains a clause that seems to provide a similar guarantee for iostreams (27.2.3¶2 in N4604):

If one thread makes a library call a that writes a value to a stream and, as a result, another thread reads this value from the stream via a library call b such that this does not result in a data race, then a's write synchronizes with b's read.

So, is it a matter of the underlying network library/facilities providing a similar guarantee?

From a practical point of view, the compiler cannot reorder accesses to the global a across the send and recv calls (since they could, in principle, use a ). However, thread2 could still read a stale value of a if the send / recv pair itself provides no memory-synchronization guarantee.

2 answers

Short answer: no, there is no general guarantee that a will be updated. My suggestion would be to send the value of a along with "foo" , e.g. "foo,42" or something similar. That is guaranteed to work and will probably not add significant overhead. [Of course, there may be other reasons why this approach works poorly.]

Long, rambling thoughts that don't really answer the question:

Global data is not guaranteed to be visible immediately across the cores of a multicore processor without further operations. Yes, most modern processors are cache-coherent, but not all models of all brands guarantee this. So if thread2 runs on a core that has already cached a copy of a , there is no guarantee that the value of a is 42 at the point where you call f .

The C++ standard does ensure that the global variable is loaded after the function call, so the compiler is not allowed to do this:

  tmp = a;
  recv(...);
  f(tmp);

but, as I said above, cache-maintenance operations may be needed to ensure that all processors see the same value at the same time. If send and recv take long enough or touch enough memory (there is no precise measure of "how long" or "how much"), you may see the correct value most or even all of the time, but for ordinary (non-atomic) types there is no guarantee that the value is actually up to date outside the thread that last wrote it.

std::atomic will help on some kinds of processors, but there is no guarantee that a change is visible in the second thread, or on the second processor core, within any particular time after it is made.

The only practical solution is some kind of "spin until I see it" code. This may require two values: one that is (for example) a counter, and one that is the actual value; that is what you need if you want to be able to say "it is 42 now; I set it again, and this time it is also 42". If a represented, say, the number of data items available in a buffer, a change would inherently mean "the value changed", and simply checking "is it the same as last time" would suffice.

The std::atomic operations come with ordering guarantees that let you express "if I update this field, the other field is guaranteed to appear at the same time or before". You can therefore use them to guarantee, for example, that a pair of data items carries both "there is a new value" (e.g. a counter giving the version number of the current data) and "the new value is X".

Of course, if you KNOW which processor architecture your code will run on, you can make stronger assumptions about the behavior. For example, all x86 and many ARM processors use the cache interface to implement atomic updates of a variable, so an atomic update on one core lets you know that no other core will see a stale value. But there are processors on which that implementation detail does not hold, and where an update, even via an atomic instruction, will not reach other cores or other threads until "some indeterminate time in the future".


In general, no: external I/O cannot be used for cross-thread synchronization.

The question takes the C++ standard itself as the baseline, insofar as it governs the behavior of library functions / external libraries. Whether the program's behavior is undefined really does depend on any synchronization guarantees provided by the network I/O functions; in the absence of such guarantees, it is genuinely undefined behavior. Switching to (initialized) atomics avoids the undefined behavior, but still does not guarantee that the "correct", updated value will be read. To guarantee that within the C++ standard, some form of locking is required (e.g. a spin lock or a mutex), even when it seems that no waiting should be necessary because of the real-time ordering of the situation.

In general, the notion of "real-time" synchronization (including visibility, not just ordering) that would be needed to avoid a potential wait between recv returning and the load of a is not supported by the C++ standard. At a lower level, however, this notion does exist, and it is usually implemented via inter-processor interrupts, e.g. FlushProcessWriteBuffers on Windows or sys_membarrier on x86 Linux. One of these would be issued after the store to a and before the send in thread1 ; no synchronization or barrier would then be required in thread2 . (It seems a plain SFENCE in thread1 might even suffice on x86, thanks to its strong memory model, at least in the absence of non-temporal loads/stores.)

No compiler barrier is needed in either thread, for the reasons stated in the question (the call to the external send function, which, for all the compiler knows, might acquire an internal mutex to synchronize with a recv call elsewhere).

The insidious problems described in section 4.3 of Hans Boehm's paper " Threads Cannot Be Implemented as a Library " should not be an issue here, since the C++ compiler is thread-aware (and, in particular, the opaque send and recv functions may contain synchronization operations); therefore, transformations that introduce writes to a after the send in thread1 are invalid under the memory model.

This leaves open the question of whether the POSIX networking functions provide the necessary guarantees. I strongly doubt it, since on some architectures with weak memory models they would be highly nontrivial and/or expensive to provide (requiring a system-wide mutex or an IPI, as mentioned earlier). On x86 specifically, it is almost certain that access to a shared resource such as a socket involves an SFENCE or MFENCE (or even a LOCK -prefixed instruction) somewhere along the line, which should be sufficient, but this is unlikely to be written down in any standard anywhere. Edit: in fact, I believe even the INT used to switch to kernel mode entails draining the store buffer (the best reference I can offer is a forum post).


Source: https://habr.com/ru/post/1275630/