Atomic int64_t on ARM Cortex M3

Since my compiler still does not support C++11 and std::atomic, I have to implement it manually with an ldrex/strex pair.

My question is: what is the correct way to atomically read-modify-write int64_t using ldrex and strex?

A simple solution like this does not work (one of the STREXW calls always returns 1):

    volatile int64_t value;
    int64_t temp;
    do {
        int32_t low  = __LDREXW( (uint32_t *)&value );
        int32_t high = __LDREXW( ((uint32_t *)&value)+1 );
        temp = (int64_t)low | ( (int64_t)high<<32);
        temp++;
    } while( __STREXW( temp, (uint32_t *)&value)
           | __STREXW( temp>>32, ((uint32_t *)&value)+1) );

I could not find anything in the manual about several consecutive LDREX or STREX instructions targeting different addresses, but it seemed to me that this should be allowed.

Otherwise, multiple threads would not be able to modify two different atomic variables in some scenarios.

+5
2 answers

This will never work, because you cannot nest exclusives this way. Implementation-wise, the Cortex-M3 local exclusive monitor does not even track an address: the exclusives reservation granule is the entire address space, so the assumption that each word is tracked separately is already invalid. However, you do not even need to consider implementation details, because the architecture already explicitly rules out back-to-back STREX instructions:

If two STREX instructions are executed without an intervening LDREX, the second STREX returns a status value of 1. This means that:

  • Every STREX must have a preceding LDREX associated with it in a given thread of execution.
  • It is not necessary for every LDREX to have a subsequent STREX.
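To make the quoted rule concrete, here is a toy host-side model of the local monitor as described above: a single reservation flag covering the whole address space. It is only a sketch under that assumption, not a description of the real hardware in detail, and the names `sim_ldrex`, `sim_strex` and `try_increment64` are hypothetical. It replays one pass of the loop from the question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption): one reservation flag for the entire address space. */
static bool monitor_exclusive = false;

/* Models LDREX: load the word and set the reservation. */
static uint32_t sim_ldrex(const volatile uint32_t *addr)
{
    monitor_exclusive = true;
    return *addr;
}

/* Models STREX: returns 0 on success, 1 on failure.
 * A successful store consumes the reservation. */
static uint32_t sim_strex(uint32_t val, volatile uint32_t *addr)
{
    if (!monitor_exclusive) {
        return 1;               /* no preceding LDREX: store is refused */
    }
    *addr = val;
    monitor_exclusive = false;
    return 0;
}

/* One pass of the loop body from the question, low word stored first.
 * Returns the OR of both status values, i.e. the loop condition. */
static uint32_t try_increment64(volatile uint32_t word[2])
{
    uint32_t low  = sim_ldrex(&word[0]);
    uint32_t high = sim_ldrex(&word[1]);
    uint64_t temp = ((uint64_t)high << 32) | low;
    temp++;
    uint32_t status_low  = sim_strex((uint32_t)temp, &word[0]);
    uint32_t status_high = sim_strex((uint32_t)(temp >> 32), &word[1]);
    return status_low | status_high;
}
```

In this model the first store succeeds and clears the reservation, so the second store always reports 1 and the loop condition never becomes 0. Note that the low word has already been modified by then, so each retry also leaves a half-updated value behind.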

Since Cortex-M3 (and ARMv7-M in general) does not have ldrexd like ARMv7-A, you will either have to use a separate lock to control all accesses to the variable, or just disable interrupts around the read-modify-write. If at all possible, it would be better to redesign things so that they do not require an atomic 64-bit type in the first place, since you would only achieve atomicity with respect to other threads on the same core anyway: you simply cannot make any 64-bit operation atomic from the point of view of an external agent such as a DMA controller.
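The "disable interrupts around the read-modify-write" option could be sketched as follows. This is a minimal sketch, assuming GCC-style inline assembly on the target; the helper names `irq_save_disable`, `irq_restore` and `atomic_add64` are hypothetical, and the `#else` branch is only a host stub so the logic can be compiled and exercised off-target.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ARM_ARCH_7M__)
/* Target build (assumes a GCC-compatible compiler): save PRIMASK, then mask IRQs. */
static inline uint32_t irq_save_disable(void)
{
    uint32_t primask;
    __asm volatile ("mrs %0, primask\n\tcpsid i"
                    : "=r" (primask) : : "memory");
    return primask;
}

/* Restore the PRIMASK value that was saved earlier. */
static inline void irq_restore(uint32_t primask)
{
    __asm volatile ("msr primask, %0" : : "r" (primask) : "memory");
}
#else
/* Host stub (assumption, for off-target testing only): a fake PRIMASK. */
static uint32_t host_primask;

static inline uint32_t irq_save_disable(void)
{
    uint32_t old = host_primask;
    host_primask = 1;
    return old;
}

static inline void irq_restore(uint32_t primask)
{
    host_primask = primask;
}
#endif

/* Adds delta to *p with interrupts masked and returns the new value.
 * Atomic only with respect to code on the same core, not to DMA. */
static int64_t atomic_add64(volatile int64_t *p, int64_t delta)
{
    uint32_t saved = irq_save_disable();
    int64_t result = *p + delta;
    *p = result;
    irq_restore(saved);
    return result;
}
```

Restoring the saved PRIMASK value, rather than unconditionally re-enabling interrupts, keeps the helper safe to call from code that already runs with interrupts masked. With `volatile int64_t value;` as in the question, the increment becomes `atomic_add64(&value, 1);`.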

+4

I would just look at how gcc does this and use the same instruction sequence.

gcc 4.8.2 claims to implement std::atomic<int64_t> with is_lock_free() returning true, even with -mcpu=cortex-m3. Unfortunately, this does not actually work: it generates code that either does not link or does not work, because there is no implementation of the helper functions it tries to use. (Thanks @Notlikethat for this.)

Here is the test code I tried. See the old version of this answer if this link is dead. I am leaving this answer up in case the idea is useful to anyone in related cases where gcc does generate useful code.

+1

Source: https://habr.com/ru/post/1244367/

