Why is the function argument order reversed?

I was experimenting a bit with functions and found that the order of the arguments is reversed in memory. Why is that?

stack-test.cpp:

    #include <stdio.h>

    void test( int a, int b, int c )
    {
        printf("%p %p %p\n", &a, &b, &c);
        printf("%d %d\n", *(&b - 1), *(&b + 1) );
    }

    int main()
    {
        test(1,2,3);
        return 0;
    }

Clang:

    $ clang++ stack-test.cpp && ./a.out
    0x7fffb9bb816c 0x7fffb9bb8168 0x7fffb9bb8164
    3 1

GCC:

    $ g++ stack-test.cpp && ./a.out
    0x7ffe0b983b3c 0x7ffe0b983b38 0x7ffe0b983b34
    3 1

EDIT: this is not a duplicate: the order of evaluation may differ from the memory layout, so this is a different question.

+6
6 answers

The calling convention is implementation dependent.

But in order to support C variadic functions (in C++ expressed with an ellipsis ... in the formal argument list), arguments are usually pushed, or stack space is reserved for them, in order from right to left. This is usually called the C calling convention (1). With this convention, and the common convention that the machine stack grows downward in memory, the first argument should end up at the lowest address, which is the opposite of your result.

And when I compile your program with MinGW g++ 5.1, which is 64-bit, I get

 000000000023FE30 000000000023FE38 000000000023FE40 

And when I compile your program with 32-bit Visual C++ 2015, I get

 00BFFC5C 00BFFC60 00BFFC64 

And both of these results are consistent with the C calling convention, unlike your result.

So the conclusion is that by default your compiler uses something other than the C calling convention, at least for non-variadic functions.

You can verify this by adding ... to the end of the list of formal arguments.


1) The C calling convention also includes that it is the caller that adjusts the stack pointer when the function returns, but that doesn't matter here.

+7

This behavior is implementation specific.

In your case, it is because the arguments are pushed onto the stack. Here's an interesting article that shows a typical process memory layout, including how the stack grows downward. Therefore, the first argument pushed onto the stack will have the highest address.

+7

The C (and C++) standard does not define the order in which arguments are passed, or how they should be organized in memory. It is up to the compiler developer (usually in cooperation with the OS developers) to come up with something that works on a particular processor architecture.

In MOST architectures, the stack (and registers) is used to pass arguments to functions, and, again for MOST architectures, the stack grows from "high to low" addresses. In most C implementations the arguments are passed "last first", so if we have a function

  void test( int a, int b, int c ) 

then the arguments are passed in the following order:

 c, b, a 

to the function.

However, what complicates this is when the values of the arguments are passed in registers and the code using the arguments takes the address of those arguments. Registers have no addresses, so you cannot take the address of a register variable. Therefore, the compiler generates code to store the value on the stack [from where we can get the address of the value], local to the function. The order in which it does this is entirely the compiler's decision, and I am pretty sure this is what you are seeing.

If you take your code and pass it through clang, we see:

    define void @test(i32 %a, i32 %b, i32 %c) #0 {
    entry:
      %a.addr = alloca i32, align 4
      %b.addr = alloca i32, align 4
      %c.addr = alloca i32, align 4
      store i32 %a, i32* %a.addr, align 4
      store i32 %b, i32* %b.addr, align 4
      store i32 %c, i32* %c.addr, align 4
      %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i32 0, i32 0), i32* %a.addr, i32* %b.addr, i32* %c.addr)
      %add.ptr = getelementptr inbounds i32, i32* %b.addr, i64 -1
      %0 = load i32, i32* %add.ptr, align 4
      %add.ptr1 = getelementptr inbounds i32, i32* %b.addr, i64 1
      %1 = load i32, i32* %add.ptr1, align 4
      %call2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.1, i32 0, i32 0), i32 %0, i32 %1)
      ret void
    }

Although this may not be entirely trivial to read, you can see that the first few lines of the test function are:

      %a.addr = alloca i32, align 4
      %b.addr = alloca i32, align 4
      %c.addr = alloca i32, align 4
      store i32 %a, i32* %a.addr, align 4
      store i32 %b, i32* %b.addr, align 4
      store i32 %c, i32* %c.addr, align 4

This essentially creates space on the stack (alloca) and stores the variables a, b and c in those locations.

The assembler code generated by gcc is even less readable, but you can see a similar thing happening here:

    subq    $16, %rsp        ; <-- "alloca" for 4 integers.
    movl    %edi, -4(%rbp)   ; Store a, b and c.
    movl    %esi, -8(%rbp)
    movl    %edx, -12(%rbp)
    leaq    -12(%rbp), %rcx  ; Take address of ...
    leaq    -8(%rbp), %rdx
    leaq    -4(%rbp), %rax
    movq    %rax, %rsi
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf           ; Call printf.

You may wonder why it allocates space for 4 integers: this is because the stack should always be aligned to 16 bytes on x86-64.

+2

C (and C++) code uses the processor stack to pass arguments to functions.

How the stack works depends on the processor. A stack can (theoretically) grow upward or downward, so your processor determines whether the addresses increase or decrease. Finally, the processor architecture is not the only thing responsible for this: there are also calling conventions for code running on that architecture.

Calling conventions say how arguments should be put on the stack for one particular processor architecture. The conventions are needed so that libraries built with different compilers can be linked together.

Basically, for you as a C user it usually makes no difference whether the addresses of variables on the stack are increasing or decreasing.

Details:

+1

The ABI defines how parameters are passed.

In your example it is a bit complicated, because the x86_64 ABI that gcc and clang use by default passes the parameters in registers (*), so they have no address.

Then you take the address of the parameters, so the compiler is forced to allocate local storage for these variables, and their order and location in memory are also implementation dependent.

  • (*) Note: up to 6 trivial parameters; any further ones are passed on the stack.
  • Reference: x86_64 ABI
+1

Speaking of 32-bit x86 Windows

Short answer: a pointer to a function argument is not necessarily a pointer into the stack area that was pushed for the actual function call; it can point anywhere the compiler has moved the variable to.

Long answer: I ran into the same problem when converting my code from bcc32 (the Embarcadero classic compiler) to CLANG. The RPC code generated by the MIDL compiler was broken because the RPC argument translator serialized the arguments by taking a pointer to the first function argument and assuming that all the following arguments come right after it, e.g. Serialize(&a).

I debugged the cdecl function calls generated by both BCC32 and CLANG:

  • BCC32: function arguments are passed in the correct order on the stack, and when the address of an argument is required, the stack address is given directly.

  • CLANG: function arguments are passed in the correct order on the stack; however, inside the actual function a copy of all the arguments is made in memory in the reverse order of the stack, and when the address of a function argument is required, the address of that copy is given, which results in the reversed order.

In other words, do not make assumptions about how function arguments are laid out in memory from within the C/C++ function code. It's compiler dependent.

In my case, a possible solution is to declare the RPC functions with the pascal (Win32) calling convention, which forces the MIDL compiler to parse the arguments individually. Unfortunately, the MIDL-generated code is heavy and bad, and requires a lot of tweaking to get it to compile (not done yet).

0

Source: https://habr.com/ru/post/989889/
