What kind of C11 data type is an array according to the AMD64 ABI

Question


I was researching the x86_64 calling convention used on OS X and read the "Aggregates and Unions" section of the System V x86-64 ABI standard. It mentions arrays, and I assumed that means something like a fixed-length C array, e.g. int[5].

I went to section 3.2.3, "Parameter Passing", to read about how arrays are passed. If I understand correctly, something like uint8_t[3] should be passed in registers, since it is smaller than the four-eightbyte limit imposed by rule 1 of the classification of aggregate types (page 18, near the bottom).

After compilation, I see that instead it is passed as a pointer. (I am compiling with clang-703.0.31 from Xcode 7.3.1 on OSX 10.11.6).

An example of the source that I used to compile is as follows:

    #include <stdio.h>

    #define type char

    extern void doit(const type[3]);
    extern void doitt(const type[5]);
    extern void doittt(const type[16]);
    extern void doitttt(const type[32]);
    extern void doittttt(const type[40]);

    int main(int argc, const char *argv[]) {
      const char a[3] = { 1, 2, 3 };
      const char b[5] = { 1, 2, 3, 4, 5 };
      const char c[16] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 1, 1, 1, 1 };
      const char d[32] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 1, 1, 1, 1,
                           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 1, 1, 1, 1 };
      const char e[40] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 1, 1, 1, 1,
                           1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 1, 1, 1, 1, 1,
                           1, 1, 1, 1, 1, 1, 1, 1 };

      doit(a);
      doitt(b);
      doittt(c);
      doitttt(d);
      doittttt(e);
    }

I put this in a file called a.c and compile it with clang -c a.c -o a.o. I then use otool to inspect the generated assembly (by running otool -tV a.o) and get the following output:

    a.o:
    (__TEXT,__text) section
    _main:
    0000000000000000    pushq   %rbp
    0000000000000001    movq    %rsp, %rbp
    0000000000000004    subq    $0x10, %rsp
    0000000000000008    leaq    _main.a(%rip), %rax
    000000000000000f    movl    %edi, -0x4(%rbp)
    0000000000000012    movq    %rsi, -0x10(%rbp)
    0000000000000016    movq    %rax, %rdi
    0000000000000019    callq   _doit
    000000000000001e    leaq    _main.b(%rip), %rdi
    0000000000000025    callq   _doitt
    000000000000002a    leaq    _main.c(%rip), %rdi
    0000000000000031    callq   _doittt
    0000000000000036    leaq    _main.d(%rip), %rdi
    000000000000003d    callq   _doitttt
    0000000000000042    leaq    _main.e(%rip), %rdi
    0000000000000049    callq   _doittttt
    000000000000004e    xorl    %eax, %eax
    0000000000000050    addq    $0x10, %rsp
    0000000000000054    popq    %rbp
    0000000000000055    retq

Or, equivalently, here it is on the Godbolt compiler explorer with clang 3.7, which targets Linux, which uses the same ABI.

So I was wondering whether anyone could point me to which C11 data types the ABI's "arrays" apply to. (It seems that clang defaults to C11; see the blurb here, right below the section on C99 inline functions.)

I also did a similar investigation with ARM and found similar results, even though the ARM standard also specifies that there is an aggregate type that is an array.

Also, does some standard specify that a fixed-length array is to be treated as a pointer?


c assembly types x86-64 calling-convention

Danzimm Aug 6 '16 at 2:33


1 answer

Peter Cordes · Accepted Answer · 2016-08-06T03:12:38+0000

Bare arrays as function args in C and C++ always decay to pointers, just as they do in several other contexts.

Arrays inside a struct or union do not decay, and are passed by value. This is why ABIs need to care about how they are passed, even though that never happens in C for bare arrays.

As Keith Thompson points out, the relevant part of the C standard is N1570 section 6.7.6.3, paragraph 7:

A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to type", where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation ... (stuff about foo[static 10], see below)

Note that multidimensional arrays work as arrays of array type, so only the outermost level of "array-ness" is converted to a pointer to array type.
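A minimal sketch of that adjustment, assuming a C11 compiler (for _Generic). The function and struct names are made up for illustration. Taking the address of a parameter shows what its type really is: &a is a const char ** rather than a pointer to an array, while an array that is a struct member keeps its full size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names, used only for this illustration. */

/* Declared as an array of 3, but the parameter's real type is `const char *`.
   Its address is therefore a `const char **`, not a pointer to an array. */
static int param_is_pointer(const char a[3]) {
    return _Generic(&a, const char **: 1, default: 0);
}

/* Only the outermost dimension is adjusted: `m` has type `int (*)[3]`,
   so `*m` is still a real array of 3 ints. */
static size_t row_bytes(int m[2][3]) {
    return sizeof *m;
}

/* An array that is a struct member is not adjusted; it travels with the struct. */
struct wrap { char a[3]; };

static size_t member_bytes(struct wrap w) {
    return sizeof w.a;
}
```

Called with a char[3], an int[2][3] and a struct wrap, these should return 1, 3 * sizeof(int) and 3 respectively.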

Terminology: the x86-64 ABI doc uses the same terminology as ARM, where structs and arrays are "aggregates" (multiple elements at sequential addresses). So the phrase "aggregates and unions" comes up a lot, because unions are handled similarly by both the language and the ABI.

It is the recursive rule for handling composite types (struct / union / class) that brings the ABI's array-passing rules into play. This is the only way you will see asm that copies an array to the stack as part of a function arg, in C or C++:

    struct s { int a[8]; };
    void ext(struct s byval);

    void foo() {
      struct s tmp = {{0}};
      ext(tmp);
    }

gcc 6.1 compiles it (for the AMD64 SysV ABI, with -O3) to the following:

    sub     rsp, 40    # align the stack and leave room for `tmp` even though it's never stored?
    push    0
    push    0
    push    0
    push    0
    call    ext
    add     rsp, 72
    ret

In the x86-64 ABI, pass-by-value happens by actually copying (into registers or onto the stack), not by hidden pointers.
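The language-level consequence of that copy can be sketched in plain C, reusing struct s from the example above. The helper names are hypothetical. A callee that modifies its by-value struct only touches its own copy, whereas a callee with a bare array parameter writes through a pointer into the caller's array.

```c
#include <assert.h>

struct s { int a[8]; };

/* Receives a copy of the whole struct, including the array inside it. */
static int bump_copy(struct s byval) {
    byval.a[0] += 1;          /* modifies only the callee's copy */
    return byval.a[0];
}

/* `a` is adjusted to `int *`, so this writes to the caller's array. */
static int bump_in_place(int a[8]) {
    a[0] += 1;
    return a[0];
}

/* Hypothetical drivers: report what the caller sees after each call. */
static int caller_sees_struct(void) {
    struct s tmp = {{0}};
    bump_copy(tmp);
    return tmp.a[0];          /* unchanged by the callee */
}

static int caller_sees_array(void) {
    int arr[8] = {0};
    bump_in_place(arr);
    return arr[0];            /* changed by the callee */
}
```

Here caller_sees_struct() should return 0 and caller_sees_array() should return 1. Whether the struct copy ends up in registers or on the stack is decided by the ABI's classification rules, not by anything visible in the C source.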

Note that return-by-value does pass a pointer as a "hidden" first arg (in rdi) when the return value is too large to fit in the 128-bit concatenation of rdx:rax (and is not a vector being returned in vector regs, etc.).

It would be possible for the ABI to use a hidden pointer for pass-by-value objects above a certain size and trust the called function not to modify the original, but that is not what the x86-64 ABI chooses to do. That would be better in some cases (especially for inefficient C++ with lots of copying without modification, i.e. wasted copies), but worse in other cases.

SysV ABI bonus reading: clang / gcc sign- or zero-extend narrow args to 32 bits.

Note that to really guarantee that a function arg points to an array of at least a given size, C99 and later let you use the static keyword in a new way: on array sizes. (It is still passed as a pointer; this does not change the ABI.)

 void bar(int arr[static 10]);

This does not make sizeof(arr) report the array size inside the called function (arr is still a pointer), but it does allow compiler warnings about going out of bounds. It also potentially enables better optimization if the compiler knows it is allowed to access elements that the C source does not. (See this blog post.)

The same keyword page for C++ indicates that ISO C++ does not support this usage of static; it is another one of those C-only features, along with C99 variable-length arrays and a few other goodies that C++ does not have.

In C++, you can use std::array<int,10> to get compile-time size information passed to the callee. However, you need to manually pass it by reference if that is what you want, since it is of course just a class containing an int arr[10]. Unlike a C-style array, it does not decay to T* automatically.
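A short sketch of the static form, assuming a C11 compiler. The function names are hypothetical. The caller promises at least 10 elements, so the callee may read arr[0] through arr[9] unconditionally, but the parameter itself is still just a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function: the caller promises `arr` points to at least 10 ints. */
static int sum10(const int arr[static 10]) {
    int total = 0;
    for (size_t i = 0; i < 10; i++) {
        total += arr[i];
    }
    return total;
}

/* `arr` is still a pointer even with `static`: its address has type `const int **`. */
static int param_is_still_pointer(const int arr[static 10]) {
    return _Generic(&arr, const int **: 1, default: 0);
}
```

Passing an array holding 1 through 10 to sum10 should give 55. Passing a shorter array or a null pointer is undefined behavior, and some compilers warn about it when they can see the problem at the call site.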

The ARM doc that you linked does not actually seem to call arrays an aggregate type: Section 4.3, "Composite Types" (which discusses alignment), distinguishes arrays from aggregate types, even though they seem to be a special case of its definition of aggregates.

A Composite Type is a collection of one or more Fundamental Data Types that are handled as a single entity at the procedure call level. A Composite Type can be any of:
An aggregate, where the members are laid out sequentially in memory
A union, where each of the members has the same address
An array, which is a repeated sequence of some other type (its base type).
The definitions are recursive; that is, each of the types may contain a Composite Type as a member

"Composite" is an umbrella term that includes arrays, structs, and unions.
