This hand-written code:
void blinkRed(void) {
    for (;;) {
        bb[0x0008646B] ^= 1;
        sys.Delay_ms(14);
    }
}
... compiles to this assembly:
08000470: ldr r4, [pc, #20] ; (0x8000488 <blinkRed()+24>) // r4 = 0x422191ac
08000472: ldr r6, [pc, #24] ; (0x800048c <blinkRed()+28>)
08000474: movs r5, #14
08000476: ldr r3, [r4, #0]
08000478: eor.w r3, r3, #1
0800047c: str r3, [r4, #0]
0800047e: mov r0, r6
08000480: mov r1, r5
08000482: bl 0x80001ac <CSTM32F100C6::Delay_ms(unsigned int)>
08000486: b.n 0x8000476 <blinkRed()+6>
This is what I expected.
But if I change only the array index (by -0x400) ...
void blinkRed(void) {
    for (;;) {
        bb[0x0008606B] ^= 1;
        sys.Delay_ms(14);
    }
}
... the generated code is less optimal:
08000470: ldr r4, [pc, #24] ; (0x800048c <blinkRed()+28>) // r4 = 0x42218000
08000472: ldr r6, [pc, #28] ; (0x8000490 <blinkRed()+32>)
08000474: movs r5, #14
08000476: ldr.w r3, [r4, #428] ; 0x1ac
0800047a: eor.w r3, r3, #1
0800047e: str.w r3, [r4, #428] ; 0x1ac
08000482: mov r0, r6
08000484: mov r1, r5
08000486: bl 0x80001ac <CSTM32F100C6::Delay_ms(unsigned int)>
0800048a: b.n 0x8000476 <blinkRed()+6>
The difference is that in the first case r4 is loaded directly with the destination address (0x422191ac), and the memory is then accessed with 2-byte (16-bit) instructions. In the second case r4 is loaded with an intermediate address (0x42218000), and the memory is accessed with 4-byte (32-bit) instructions that add an offset (+0x1ac) to reach the destination address (0x422181ac).
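To make the address arithmetic explicit, here is a small host-side sketch. It assumes that bb starts exactly at the BITBAND origin 0x42000000 (which matches the r4 values seen in the disassembly); the names kBitbandBase and elementAddress are made up for this illustration and are not part of my project.

```cpp
#include <cstdint>

// Assumption: bb[0] sits at the BITBAND origin from the linker script.
constexpr std::uint32_t kBitbandBase = 0x42000000u;

// Address of bb[index]; every element is a 4-byte u32.
constexpr std::uint32_t elementAddress(std::uint32_t index) {
    return kBitbandBase + index * 4u;
}

// First case: the full address is loaded into r4 and used with offset 0.
static_assert(elementAddress(0x0008646Bu) == 0x422191ACu,
              "first index maps to 0x422191ac");

// Second case: same final address arithmetic, but the compiler splits it
// into an intermediate base (0x42218000) plus an immediate offset (0x1ac).
static_assert(elementAddress(0x0008606Bu) == 0x422181ACu,
              "second index maps to 0x422181ac");
static_assert(0x42218000u + 0x1ACu == elementAddress(0x0008606Bu),
              "base + offset reaches the same element");

// An index difference of 0x400 is an address difference of 0x1000 bytes.
static_assert(elementAddress(0x0008646Bu) - elementAddress(0x0008606Bu) == 0x1000u,
              "0x400 elements * 4 bytes");
```

So both indices are ordinary compile-time constants that differ by 0x1000 bytes; only the way the compiler materializes the address differs.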
Why does the compiler do this?
I use: arm-none-eabi-g++ -mcpu=cortex-m3 -mthumb -g2 -Wall -O1 -std=gnu++14 -fno-exceptions -fno-use-cxa-atexit -fstrict-volatile-bitfields -c -DSTM32F100C6T6B -DSTM32F10X_LD_VL
bb is declared as:
__attribute__ ((section(".bitband"))) volatile u32 bb[0x00800000];
In the linker script (.ld) it is placed as follows. In the MEMORY section:
BITBAND(rwx): ORIGIN = 0x42000000, LENGTH = 0x02000000
In the SECTIONS section:
.bitband (NOLOAD) : SUBALIGN(0x02000000) { KEEP(*(.bitband)) } > BITBAND
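For context on what each bb element stands for, the following sketch applies the standard Cortex-M3 peripheral bit-band mapping (alias word = alias base + byte offset * 32 + bit number * 4). It assumes bb overlays the alias region starting at 0x42000000, as the linker script above arranges; BitTarget, decodeAlias and aliasOf are hypothetical helper names used only for this illustration.

```cpp
#include <cstdint>

// Cortex-M3 peripheral bit-band region and its alias region.
constexpr std::uint32_t kPeriphBase = 0x40000000u;
constexpr std::uint32_t kAliasBase  = 0x42000000u;

// Hypothetical helper type: the byte and the bit an alias word refers to.
struct BitTarget {
    std::uint32_t byteAddress;
    std::uint32_t bit;
};

// Alias word address -> (byte address, bit number within that byte).
constexpr BitTarget decodeAlias(std::uint32_t aliasAddress) {
    return BitTarget{kPeriphBase + ((aliasAddress - kAliasBase) >> 5),
                     ((aliasAddress - kAliasBase) >> 2) & 7u};
}

// (byte address, bit number) -> alias word address.
constexpr std::uint32_t aliasOf(std::uint32_t byteAddress, std::uint32_t bit) {
    return kAliasBase + (byteAddress - kPeriphBase) * 32u + bit * 4u;
}

// bb[0x0008606B] lives at 0x422181ac, i.e. bit 3 of the byte at 0x40010c0d.
static_assert(decodeAlias(0x422181ACu).byteAddress == 0x40010C0Du, "byte address");
static_assert(decodeAlias(0x422181ACu).bit == 3u, "bit number");
static_assert(aliasOf(0x40010C0Du, 3u) == 0x422181ACu, "round trip");
```

With this mapping, bb[i] is simply the alias word at kAliasBase + 4 * i, so the array index encodes both the target byte and the bit.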