ARM assembly: auto-increment register on store

Is it possible to auto-increment the base address register on a STR with [Rn]!? I looked through the documentation but could not find a definitive answer, mainly because the instruction syntax is presented for both LDR and STR. In theory it should work for both, but I could not find any examples of auto-increment on a store (loads work fine).

I made a small program that stores two numbers in a vector. When it finishes, the contents of out should be {1, 2}, but the second store overwrites the first element, as if the auto-increment were not working.

    #include <stdio.h>

    int main()
    {
        int out[]={0, 0};
        asm volatile (
            "mov r0, #1 \n\t"
            "str r0, [%0]! \n\t"
            "add r0, r0, #1 \n\t"
            "str r0, [%0] \n\t"
            :: "r"(out)
            : "r0" );
        printf("%d %d\n", out[0], out[1]);
        return 0;
    }

EDIT: Although the answer was right for regular loads and stores, I found that the optimizer breaks auto-increment on vector instructions such as vldm/vstm. For example, the following program

    #include <stdio.h>

    int main()
    {
        volatile int *in = new int[16];
        volatile int *out = new int[16];

        for (int i=0;i<16;i++)
            in[i] = i;

        asm volatile (
            "vldm %0!, {d0-d3} \n\t"
            "vldm %0, {d4-d7} \n\t"
            "vstm %1!, {d0-d3} \n\t"
            "vstm %1, {d4-d7} \n\t"
            :: "r"(in), "r"(out)
            : "memory" );

        for (int i=0;i<16;i++)
            printf("%d\n", out[i]);
        return 0;
    }

compiled with

 g++ -O2 -march=armv7-a -mfpu=neon main.cpp -o main 

will print garbage for the last 8 values, because the optimizer keeps the incremented pointer and uses it for printf. In other words, out[i] is actually out[i+8], so the first 8 printed values are the last 8 of the vector, and the rest are memory locations past the end of the array.

I tried various combinations of the volatile keyword throughout the code, but the behavior only changes if I compile with the -O0 flag or if I use a volatile array instead of a pointer and new, for example

 volatile int out[16]; 
3 answers

For store and load you do this:

    ldr r0,[r1],#4
    str r0,[r2],#4

Whatever you put at the end, 4 in this case, is added to the base register (r1 in the ldr example and r2 in the str example) after the register has been used for the address but before the instruction completes. It is very much like

    unsigned int a,*b,*c;
    ...
    a = *b++;
    *c++ = a;

EDIT: you need to look at the disassembly to see what is going on, if anything. I am using the latest Code Sourcery, or now just Sourcery Lite, toolchain from Mentor Graphics.

arm-none-linux-gnueabi-gcc (Sourcery CodeBench Lite 2011.09-70) 4.6.1

    #include <stdio.h>

    int main ()
    {
        int out[]={0, 0};
        asm volatile (
            "mov r0, #1 \n\t"
            "str r0, [%0], #4 \n\t"
            "add r0, r0, #1 \n\t"
            "str r0, [%0] \n\t"
            :: "r"(out)
            : "r0" );
        printf("%d %d\n", out[0], out[1]);
        return 0;
    }

    arm-none-linux-gnueabi-gcc str.c -O2 -o str.elf
    arm-none-linux-gnueabi-objdump -D str.elf > str.list

    00008380 <main>:
        8380: e92d4010 push {r4, lr}
        8384: e3a04000 mov r4, #0
        8388: e24dd008 sub sp, sp, #8
        838c: e58d4000 str r4, [sp]
        8390: e58d4004 str r4, [sp, #4]
        8394: e1a0300d mov r3, sp
        8398: e3a00001 mov r0, #1
        839c: e4830004 str r0, [r3], #4
        83a0: e2800001 add r0, r0, #1
        83a4: e5830000 str r0, [r3]
        83a8: e59f0014 ldr r0, [pc, #20] ; 83c4 <main+0x44>
        83ac: e1a01004 mov r1, r4
        83b0: e1a02004 mov r2, r4
        83b4: ebffffe5 bl 8350 <_init+0x20>
        83b8: e1a00004 mov r0, r4
        83bc: e28dd008 add sp, sp, #8
        83c0: e8bd8010 pop {r4, pc}
        83c4: 0000854c andeq r8, r0, ip, asr #10

so

    sub sp, sp, #8

allocates the two local ints, out[0] and out[1], on the stack

    mov r4,#0
    str r4,[sp]
    str r4,[sp,#4]

initializes them to zero. Then comes the inline assembly

    8398: e3a00001 mov r0, #1
    839c: e4830004 str r0, [r3], #4
    83a0: e2800001 add r0, r0, #1
    83a4: e5830000 str r0, [r3]

and then printf:

    83a8: e59f0014 ldr r0, [pc, #20] ; 83c4 <main+0x44>
    83ac: e1a01004 mov r1, r4
    83b0: e1a02004 mov r2, r4
    83b4: ebffffe5 bl 8350 <_init+0x20>

and now it is clear why it didn't work. You did not declare out as volatile, so you gave the compiler no reason to go back to RAM to fetch the values of out[0] and out[1] for printf. The compiler knows that r4 holds the value of both out[0] and out[1], and there is so little code in this function that it never needed to evict r4 and reuse it, so it simply passes r4 to printf.

If you change it to volatile

  volatile int out[]={0, 0}; 

Then you should get the desired result:

    83a8: e59f0014 ldr r0, [pc, #20] ; 83c4 <main+0x44>
    83ac: e59d1000 ldr r1, [sp]
    83b0: e59d2004 ldr r2, [sp, #4]
    83b4: ebffffe5 bl 8350 <_init+0x20>

Now the arguments for printf are read back from RAM.


GCC inline assembler requires that all modified registers and non-volatile variables be listed as outputs or clobbers. In the second example, GCC is free to assume that the registers allocated to in and out are not changed.

The right approach:

    out_temp = out;
    asm volatile ("..."
                  : "+r"(in), "+r"(out_temp)
                  :: "memory" );

I found this question while searching for an answer to a similar one: how to bind an input/output register. The GCC documentation on inline assembler constraints states that the + prefix on an operand constraint marks the operand as both read and written.

In this example, it seems to me that you would prefer to keep the original value of the out variable. However, if you want to use the post-increment (!) variant of the instruction, I think you should declare the parameters as read/write. The following worked on my Raspberry Pi 2:

    #include <stdio.h>

    int main()
    {
        // new int[16] allocates 16 elements; new int(16) would allocate
        // a single int initialized to 16
        int* in = new int[16];
        volatile int* out = new int[16];

        for (int i=0; i<16; i++)
            in[i]=i;

        asm volatile(
            "vldm %0!, {d0-d3}\n\t"
            "vldm %0, {d4-d7}\n\t"
            "vstm %1!, {d0-d3}\n\t"
            "vstm %1, {d4-d7}\n\t"
            :"+r"(in), "+r"(out)
            :: "memory");

        // out was advanced by 8 elements by the write-back, so index back from it
        for (int i=0; i<16; i++)
            printf("%d\n", out[i-8]);
        return 0;
    }

This makes the semantics of the code clear to the compiler: both the in and out pointers will be changed (advanced by 8 elements).

Disclaimer: I do not know whether the ARM ABI allows the NEON registers d0 through d7 to be clobbered freely. In this simple example, it probably doesn't matter.


Source: https://habr.com/ru/post/907426/

