How are global pointer variables stored in memory?

Suppose we have this simple code:

int* q = new int(13);

int main() {
    return 0;
}

It is clear that the variable q is global and initialized. From this answer, we expect the variable q to be stored in the data segment (.data) inside the program file. But q is a pointer, so its value (an address in the heap segment) is only determined at run time. So what value is stored in the data segment inside the program file?

My attempt:
In my opinion, the compiler allocates some space for the variable q (usually 8 bytes for a 64-bit address) in the data segment without a meaningful value. It then places initialization code in the text segment, before the code of the main function, to initialize the variable q at run time. Something like this in assembly:

     ....
     mov  edi, 4
     call operator new(unsigned long)
     mov  DWORD PTR [rax], 13  // rax: 64 bit address (pointer value)

     // offset : q variable offset in data segment, calculated by compiler
     mov  QWORD PTR [ds+offset], rax // store address in data segment
     ....
main:
     ....

Any idea?

+4
2 answers

Yes, this is essentially how it works.

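Note that, in ELF, .data, .bss and .text are sections, not segments. You can look at the assembly yourself by running the compiler: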

c++ -S -O2 test.cpp

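You will typically see the main function, plus some initialization code outside of it. The program entry point (part of the C++ runtime) calls the initialization code, and then calls main. The initialization code is also what runs things like the constructors of global objects.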
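The ordering can be observed from C++ itself. The following is a minimal sketch, not taken from the answer: the names make_q, init_count, q_ready and check_q_ready are hypothetical, and it assumes a hosted implementation that runs dynamic initializers of globals before main is entered, as the common ELF toolchains do.

```cpp
#include <cassert>

// Hypothetical counter: zero-initialized statically, before any dynamic initializer runs.
static int init_count = 0;

// Hypothetical helper used as a non-constant initializer.
static int* make_q() {
    ++init_count;            // side effect shows that this ran during startup
    return new int(13);
}

// Same shape as the question's global: a pointer with a run-time initializer.
int* q = make_q();

// True if the startup code has already initialized q exactly once.
bool q_ready() {
    return init_count == 1 && q != nullptr && *q == 13;
}

// Intended to be called as the first statement of main.
void check_q_ready() {
    assert(q_ready());
}
```

Calling check_q_ready() as the very first statement of main should pass, because the initializer has already run by then.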

+3

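int *q goes in .bss, not .data, because it is only initialized at run time by a non-constant initializer (which is legal only in C++, not in C). There is no need to store 8 bytes for it in the data segment of the executable.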
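To make the distinction concrete, here is a small sketch with three globals. The names in_data, in_bss and globals_ok are hypothetical, and the section placement in the comments is what gcc and clang typically do on ELF targets, not something the C++ standard guarantees.

```cpp
#include <cassert>

int  in_data = 42;        // constant nonzero initializer: typically stored in .data
int  in_bss;              // zero-initialized: typically only reserved in .bss
int* q = new int(13);     // run-time initializer: zeroed slot in .bss, filled in at startup

// Checks the values each global holds once startup has finished.
bool globals_ok() {
    return in_data == 42 && in_bss == 0 && q != nullptr && *q == 13;
}
```

Running nm on the resulting object file should show in_data with symbol type D and both in_bss and q with type B, assuming a typical GNU toolchain.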

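The compiler arranges for the initializer function to run by putting its address into an array of initializers, which the CRT (C Run-Time) startup code calls before calling main.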

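On the Godbolt compiler explorer you can see the asm of the init function without the noise of directives. Note that the addressing mode is just a RIP-relative access to q. The linker fills in the offset from RIP, which is a link-time constant even though .text and .bss end up in different segments.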
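The arithmetic behind a RIP-relative access can be sketched as follows. The function names and the addresses in the usage note are made up for illustration. The only fact relied on is that the operand address is the address of the next instruction plus a signed 32-bit displacement stored in the instruction.

```cpp
#include <cassert>
#include <cstdint>

// Address a RIP-relative operand refers to: next instruction address + signed disp32.
std::uint64_t rip_relative_target(std::uint64_t next_insn_addr, std::int32_t disp32) {
    return next_insn_addr + static_cast<std::int64_t>(disp32);
}

// Displacement the linker has to store so that the access reaches symbol_addr.
std::int32_t link_time_disp32(std::uint64_t symbol_addr, std::uint64_t next_insn_addr) {
    std::int64_t delta = static_cast<std::int64_t>(symbol_addr - next_insn_addr);
    assert(delta >= INT32_MIN && delta <= INT32_MAX);  // must fit in 32 bits
    return static_cast<std::int32_t>(delta);
}
```

For example, with a hypothetical next-instruction address of 0x401020 and q placed at 0x404000, link_time_disp32(0x404000, 0x401020) gives 0x2fe0, and rip_relative_target(0x401020, 0x2fe0) gives back 0x404000. The displacement depends only on the distance between the two locations, which is why it can be fixed at link time.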

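Godbolt's filtering of compiler noise is not ideal for this purpose. Some of the directives are relevant, but many are not. Below is a hand-picked mix of gcc6.2 -O3 asm output from Godbolt with directive filtering turned off, for just the statement int* q = new int(13);. (There is no need to compile main at the same time, since we are not linking an executable.)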

# gcc6.2 -O3 output
_GLOBAL__sub_I_q:      # presumably stands for subroutine
    sub     rsp, 8           # align the stack for calling another function
    mov     edi, 4           # 4 bytes
    call    operator new(unsigned long)   # this is the demangled name, like from objdump -dC
    mov     DWORD PTR [rax], 13
    mov     QWORD PTR q[rip], rax      # clang uses the equivalent `[rip + q]`
    add     rsp, 8
    ret

    .globl  q
    .bss
q:
    .zero   8      # reserve 8 bytes in the BSS

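The compiler does not compute an offset from the start of the ELF data segment (or of any other segment). It emits a reference to the symbol q and leaves the address to the linker.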

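No segment register is involved either. x86 ELF code uses a flat memory model. (The default segment for this addressing mode is already DS, so [ds:rip+q] would just be a redundant way of writing the same access. In 64-bit mode the DS base is treated as zero anyway, so it has no effect on the address.)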


This is the code that makes sure the initializer is called before main():

    # the "aw" sets options / flags for this section to tell the linker about it.
    .section        .init_array,"aw"
    .align 8
    .quad   _GLOBAL__sub_I_q       # this assembles to the absolute address of the function.

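The CRT startup code has a loop that knows the size of the .init_array section and makes an indirect call through each function pointer in turn.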
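That loop can be imitated in plain C++. This is only a simplified sketch under stated assumptions: fake_init_array stands in for the real .init_array section, init_q plays the role of _GLOBAL__sub_I_q, and run_initializers is a hypothetical stand-in for the startup code, whose real implementation lives in the C library and may pass arguments to each function.

```cpp
#include <cassert>
#include <cstddef>

using init_fn = void (*)();

int* q = nullptr;                      // stands in for the 8 zeroed bytes in .bss

// Plays the role of _GLOBAL__sub_I_q: allocate, store 13, publish the pointer.
static void init_q() { q = new int(13); }

// Stand-in for the .init_array section: a table of function pointers.
static init_fn fake_init_array[] = { init_q };

// Calls every entry in [begin, end) in order and returns how many were called.
std::size_t run_initializers(init_fn* begin, init_fn* end) {
    std::size_t called = 0;
    for (init_fn* p = begin; p != end; ++p) {
        (*p)();                        // indirect call through the table entry
        ++called;
    }
    return called;
}

// Convenience wrapper that walks the whole stand-in table.
std::size_t run_fake_init_array() {
    const std::size_t n = sizeof(fake_init_array) / sizeof(fake_init_array[0]);
    return run_initializers(fake_init_array, fake_init_array + n);
}
```

Before run_fake_init_array() is called, q is still null. Afterwards it points at an int holding 13, mirroring what the real startup code achieves before main starts.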

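The .init_array section is marked writable, so it goes into the data segment. I am not sure what writes to it. Perhaps the CRT code marks entries as done by zeroing the pointers after calling them?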


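Linux has a similar mechanism for running initializers in dynamic libraries, which is carried out by the ELF interpreter during dynamic linking. This is why you can call printf() or other glibc stdio functions from a _start written by hand in asm, and why that fails in a statically linked binary unless you call the right init functions yourself. (See this Q&A about building static or dynamic binaries that define their own _start or just main(), with or without libc.)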

+2

Source: https://habr.com/ru/post/1664100/

