No, volatile is not harmful. In any situation. Ever. There is no well-formed piece of code that breaks when you add volatile to an object (and to pointers to that object). However, volatile is often poorly understood. The reason the kernel docs say that volatile should be considered harmful is that people kept using it for synchronization between kernel threads. In particular, they used volatile integer variables as if access to them were guaranteed to be atomic, which it is not.
volatile is also not useless, and in particular, if you go bare metal, you will need it. But, like any other tool, it is important to understand the semantics of volatile before using it.
What volatile is
Access to volatile objects is considered a side effect by the standard, in the same way as incrementing or decrementing with ++ and --. In particular, this means that 5.1.2.3 (3), which states that
(...) An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object)
does not apply. The compiler has to throw away everything it has inferred about the value of a volatile variable at every sequence point (accesses to volatile objects are, like other side effects, ordered by sequence points).
The effect of this is largely that certain optimizations are forbidden. Take, for example, the code
int i;

void foo(void) {
  i = 0;

  while(i == 0) {
    /* do something that does not write to i */
  }
}
The compiler is allowed to turn this into an infinite loop that never checks i again, because it can deduce that the value of i does not change in the loop, and therefore that i == 0 will never be false. This is true even if there is another thread or an interrupt handler that could conceivably change i; the compiler does not know about them, and it does not care. It is explicitly allowed not to care.
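To illustrate, here is a hand-written sketch (an assumption about plausible compiler behavior, not actual compiler output) of what foo may effectively be transformed into:

int i;

void foo(void) {
  i = 0;

  /* Since nothing in the loop body writes to i, the compiler may hoist
     the test out entirely and leave an unconditional infinite loop: */
  for(;;) {
    /* do something that does not write to i */
  }
}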
Contrast this with
int volatile i;

void foo(void) {
  i = 0;

  while(i == 0) {
    /* do something that does not write to i */
  }
}
Now the compiler has to assume that i can change at any time and cannot perform this optimization. This means, of course, that volatile objects are necessary for synchronization when you're dealing with interrupt handlers and threads. They are not, however, sufficient.
What volatile is not
One thing volatile does not guarantee is atomic access. This should make intuitive sense if you're used to embedded programming. Consider, if you will, the following code snippet for an 8-bit AVR MCU:
#include <stdint.h>
#include <avr/interrupt.h>

void do_something_with(uint32_t value);

uint32_t volatile i;

ISR(TIMER0_OVF_vect) {
  /* Runs every time timer 0 overflows. */
  ++i;
}

void some_function_in_the_main_loop(void) {
  for(;;) {
    do_something_with(i); /* not atomic on an 8-bit MCU */
  }
}
The reason this code is broken is that access to i is not atomic, and it cannot be on an 8-bit MCU. In this simple case, for example, the following could happen:
- i is 0x0000ffff
- do_something_with(i) is about to be called
- the top two bytes of i are copied into the parameter slot for this call
- at this moment, timer 0 overflows and interrupts the main loop
- the ISR changes i. The bottom two bytes of i overflow and are now 0; i is now 0x00010000.
- the main loop continues, and the bottom two bytes of i are copied into the parameter slot
- do_something_with is called with 0 as its parameter.
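To make the tearing in the steps above concrete, here is a rough C-level sketch of how the 32-bit read of i effectively proceeds on an 8-bit target (the helper name is made up for illustration; a real compiler emits individual byte loads rather than this exact code):

#include <stdint.h>

extern uint32_t volatile i;

uint32_t read_i_nonatomically(void) {
  /* On an 8-bit CPU the 32-bit value is fetched one byte at a time. */
  union { uint32_t value; uint8_t bytes[4]; } tmp;
  volatile uint8_t const *p = (volatile uint8_t const *)&i;

  tmp.bytes[0] = p[0]; /* the timer ISR may fire here... */
  tmp.bytes[1] = p[1]; /* ...or here... */
  tmp.bytes[2] = p[2]; /* ...or here... */
  tmp.bytes[3] = p[3]; /* ...or here, mixing old and new bytes */

  return tmp.value;
}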
Similar things can happen on PCs and other platforms. If anything, more failure modes open up with more complex architectures.
Takeaway
So no, using volatile is not bad, and you will (often) have to do it in bare-metal code. However, when you do use it, keep in mind that it is not a magic wand, and you still have to make sure you don't trip over your own feet. In embedded code, there is usually a platform-specific way to solve the atomicity problem; in the case of AVR, for example, the usual brute-force approach is to disable interrupts for the duration, as in
uint32_t x;

ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {
  x = i;
}

do_something_with(x);
... where the ATOMIC_BLOCK macro (from AVR-libc's <util/atomic.h>) calls cli() (disable interrupts) before the block and sei() (enable interrupts) after it, if interrupts were enabled beforehand.
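If it helps, here is a rough hand-rolled equivalent of what that block does (sample_i and the surrounding declarations are made up for this sketch; in real code, prefer the ATOMIC_BLOCK macro itself):

#include <stdint.h>
#include <avr/io.h>
#include <avr/interrupt.h>

extern uint32_t volatile i;
void do_something_with(uint32_t value);

void sample_i(void) {
  uint8_t sreg = SREG; /* remember the current interrupt state */
  cli();               /* disable interrupts */
  uint32_t x = i;      /* the 32-bit copy can no longer be torn by the ISR */
  SREG = sreg;         /* restore the previous state; interrupts come back on
                          only if they were enabled before */

  do_something_with(x);
}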
C11, the first C standard that explicitly acknowledges the existence of multithreading, introduced a new family of atomic types and memory-fencing operations that can be used for inter-thread synchronization and, in many cases, make the use of volatile unnecessary. If you can use those, do so, but it will probably be a while before they reach all the common embedded toolchains. With them, the loop above could be fixed like this:
#include <stdatomic.h>

atomic_int i;

void foo(void) {
  atomic_store(&i, 0);

  while(atomic_load(&i) == 0) {
    /* do something that does not store to i */
  }
}
... in its most basic form. The exact semantics of the more relaxed memory orderings are beyond the scope of an SO answer, so I will stick with the default sequentially consistent operations here.
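For reference, the defaulted calls above are equivalent to the explicit sequentially consistent forms; the more relaxed orderings are requested through the same _explicit variants (shown here only as notation, not as a recommendation):

#include <stdatomic.h>

atomic_int i;

void foo(void) {
  atomic_store_explicit(&i, 0, memory_order_seq_cst); /* same as atomic_store(&i, 0) */

  while(atomic_load_explicit(&i, memory_order_seq_cst) == 0) {
    /* a weaker ordering would be requested by passing, say,
       memory_order_acquire instead of memory_order_seq_cst */
  }
}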
If you're interested in this, Gil Hamilton provided a link in the comments to an explanation of a lock-free stack implementation using C11 atomics, although I don't feel it is a terribly good write-up of the memory ordering semantics themselves. The C11 model does appear to closely mirror the C++11 memory model, though, of which a useful overview exists here. If I find a link to a C11-specific write-up, I will put it here later.