What does this assembly language code mean?

I am a student and have just started learning assembly language. To understand it better, I wrote a short program in C and translated it into assembly language. Surprisingly, I did not understand any of it.

Code:

    #include<stdio.h>

    int main()
    {
        int n;
        n=4;
        printf("%d",n);
        return 0;
    }

And the corresponding assembly language:

        .file   "delta.c"
        .section        .rodata
    .LC0:
        .string "%d"
        .text
        .globl  main
        .type   main, @function
    main:
    .LFB0:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        andl    $-16, %esp
        subl    $32, %esp
        movl    $4, 28(%esp)
        movl    $.LC0, %eax
        movl    28(%esp), %edx
        movl    %edx, 4(%esp)
        movl    %eax, (%esp)
        call    printf
        movl    $0, %eax
        leave
        .cfi_restore 5
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
    .LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
        .section        .note.GNU-stack,"",@progbits

What does it mean?

2 answers

Let me break it down:

 .file "delta.c" 

The compiler emits this to record the source file from which the assembly was generated. It means little to the assembler itself.

 .section .rodata 

This starts a new section. ".rodata" is the name of the read-only data section. Data in this section is written to the executable file and ends up in memory that is mapped as read-only. All the ".rodata" pages of the executable image are ultimately shared by all the processes that load the image.

As a rule, any "compile-time constants" in the source code that cannot be folded directly into the instructions themselves will be stored in the read-only data section.

 .LC0: .string "%d" 

The ".LC0:" is a label. It provides a symbolic name that refers to the bytes that come after it in the file. In this case, ".LC0" represents the string "%d". The GNU assembler uses the convention that labels starting with ".L" (on ELF targets) are considered "local labels". This has a technical meaning that is mostly of interest to people who write compilers and linkers. Here the compiler uses it to denote a symbol that is private to a particular object file. In this case, it represents a string constant.
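To make "the bytes that come after the label" concrete, here is a small C sketch of what ".string" lays down. The names fmt_bytes, format_size and format_byte are made up for this illustration. It relies on the fact that ".string" appends a terminating zero byte, just as a C string literal does, which is also why the Windows listing further down spells it as .ascii "%d\0".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the bytes at .LC0: the literal "%d". */
static const char fmt_bytes[] = "%d";

/* C11 compile-time check: '%', 'd', and the terminating NUL byte. */
static_assert(sizeof fmt_bytes == 3, "two characters plus the NUL terminator");

/* Number of bytes the literal occupies, terminator included. */
size_t format_size(void)
{
    return sizeof fmt_bytes;
}

/* Byte i of the literal, as an unsigned value. */
unsigned char format_byte(size_t i)
{
    assert(i < sizeof fmt_bytes);
    return (unsigned char)fmt_bytes[i];
}
```

So the label names the address of three bytes: 0x25 ('%'), 0x64 ('d') and 0x00.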

 .text 

This starts a new section. The text section is the section of the object file in which executable code is stored.

 .globl main 

The ".globl" directive tells the assembler to add the label that follows it to the list of symbols "exported" by the generated object file. This basically means "this is a symbol that should be visible to the linker." For example, a non-static function in C can be called by any C file that declares (or includes) a compatible function prototype. This is why you can #include stdio.h and then call printf. When any non-static C function is compiled, the compiler generates assembly that declares a global label at the start of the function. Contrast this with things that should not be shared, such as string literals. The assembly code in the object file still needs a label to refer to the literal data. Those are "local" symbols.
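The static versus non-static distinction can be sketched in C like this. The function names twice and twice_plus_one are invented for the example. With GCC at default settings I would expect only the non-static function to get a ".globl" line in the generated assembly, but treat that as an expectation to check with gcc -S rather than a guarantee.

```c
#include <assert.h>
#include <limits.h>

/* static: internal linkage. The label for this function stays private
   to this object file, so no .globl directive is expected for it. */
static int twice(int x)
{
    assert(x <= INT_MAX / 2 && x >= INT_MIN / 2); /* guard against overflow */
    return 2 * x;
}

/* Non-static: external linkage. Other C files that declare a matching
   prototype can call it, so its label has to be visible to the linker. */
int twice_plus_one(int x)
{
    return twice(x) + 1;
}
```

Another file could declare int twice_plus_one(int); and call it, while a reference to twice from another file would fail at link time.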

 .type main, @function 

I do not know exactly how GAS (the GNU assembler) handles ".type" directives. However, this tells the assembler that the label "main" refers to executable code, not data.

 main: 

This defines the entry point for your "main" function.

 .LFB0: 

This is the "local label" that refers to the start of the function.

  .cfi_startproc 

This is a "Call Frame Information" (CFI) directive. It instructs the assembler to emit call frame information in the DWARF debugging format.

  pushl %ebp 

This is a standard part of the function prologue in assembly code. It saves the current value of the ebp register. The "ebp" or "base pointer" register is used to store the "base" of the stack frame within a function. While the esp register ("stack pointer") can change as the function pushes values or calls other functions, "ebp" remains fixed. Any function arguments can always be accessed relative to "ebp". Under the ABI calling conventions, before a function can modify the EBP register, it must save it so that the original value can be restored before the function returns.

  .cfi_def_cfa_offset 8
  .cfi_offset 5, -8

I did not examine them in detail, but I believe that they are related to DWARF debugging information.

  movl %esp, %ebp 

GAS uses AT&T syntax, in which the operand order is the reverse of that used in Intel syntax: the source comes first and the destination second. So this instruction means "set ebp to esp". It establishes the "base pointer" for the rest of the function.

  .cfi_def_cfa_register 5
  andl $-16, %esp
  subl $32, %esp

This is also part of the function prologue. It aligns the stack pointer to a 16-byte boundary and then subtracts enough space from it to hold all the locals for the function.
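The arithmetic behind the "andl" and "subl" instructions above can be worked through in C. This is only a sketch: the function names are made up, and the starting address used below (0xBFFFF3AC) is an arbitrary example value, not something taken from a real run.

```c
#include <assert.h>
#include <stdint.h>

/* andl $-16, %esp
   -16 is 0xFFFFFFF0 in 32-bit two's complement, so the AND clears the
   low four bits and rounds the address down to a multiple of 16. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & (uint32_t)-16;
}

/* andl $-16, %esp followed by subl $32, %esp
   The stack grows toward lower addresses, so subtracting reserves space. */
uint32_t reserve_frame(uint32_t esp)
{
    uint32_t aligned = align_down_16(esp);
    assert(aligned >= 32); /* keep the toy arithmetic from wrapping around */
    return aligned - 32;
}
```

With the example value, 0xBFFFF3AC is rounded down to 0xBFFFF3A0, and reserving 32 bytes gives 0xBFFFF380. An address that is already a multiple of 16 is left unchanged by the AND.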

  movl $4, 28(%esp) 

This stores the 32-bit integer constant 4 into a slot in the stack frame. That slot is the local variable n.

  movl $.LC0, %eax 

This loads the address of the string constant "%d" defined above into eax.

  movl 28(%esp), %edx 

This loads the value "4" stored at offset 28 on the stack into edx. Most likely, your code has been compiled with optimizations disabled.
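As a rough C-level picture of that remark, compare the two functions below. They are hypothetical and write into a buffer with snprintf instead of calling printf so the result can be checked. The volatile qualifier is only a device to mimic the store-then-reload through 28(%esp); it is my assumption about how to model the unoptimized code, not what the compiler literally does. An optimizing build would typically behave like the second function and pass the constant directly.

```c
#include <assert.h>
#include <stdio.h>

/* Mirrors the unoptimized listing: n is written to memory and then
   read back before being passed on. */
int format_via_local(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    volatile int n; /* volatile forces a real store and a real reload */
    n = 4;
    return snprintf(buf, size, "%d", n);
}

/* What the same code amounts to once the constant is passed directly. */
int format_folded(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d", 4);
}
```

Both functions produce the same text; only the route the value 4 takes differs.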

  movl %edx, 4(%esp) 

This then stores the value 4 on the stack, in the slot where printf expects to find its second argument.

  movl %eax, (%esp) 

This stores the address of the string "%d" at the top of the stack, where printf expects to find its first argument.
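One way to picture the 32 bytes reserved by "subl $32, %esp" is as a C struct whose member offsets match the offsets in the listing. The struct and function names here are hypothetical, and the format address is passed in as a plain number because the real address of .LC0 is only known after linking. The compile-time checks use C11 static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of main's frame, with offsets relative to esp
   just before "call printf". */
struct main_frame {
    uint32_t arg0_format; /* (%esp)   : address of "%d", first argument */
    uint32_t arg1_value;  /* 4(%esp)  : copy of n, second argument      */
    uint32_t unused[5];   /* 8..27    : not touched by this function    */
    int32_t  n;           /* 28(%esp) : the local variable n            */
};

static_assert(offsetof(struct main_frame, arg1_value) == 4, "second argument at 4(%esp)");
static_assert(offsetof(struct main_frame, n) == 28, "local n at 28(%esp)");
static_assert(sizeof(struct main_frame) == 32, "matches subl $32, %esp");

/* Fill the frame in the same order of effects as the listing. */
void prepare_call(struct main_frame *f, uint32_t format_address)
{
    assert(f != NULL);
    f->n = 4;                        /* movl $4, 28(%esp)                       */
    f->arg1_value = (uint32_t)f->n;  /* movl 28(%esp), %edx; movl %edx, 4(%esp) */
    f->arg0_format = format_address; /* movl $.LC0, %eax; movl %eax, (%esp)     */
}
```

The arguments sit at the lowest addresses of the frame because that is where the callee looks for them, while the local lives at the other end.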

  call printf 

This calls printf.

  movl $0, %eax 

This sets eax to 0. Given that the following instructions are "leave" and "ret", this is equivalent to "returning 0" in the C code. The EAX register is used to store the return value of a function.

  leave 

This instruction tears down the stack frame. It sets ESP back to the value of EBP, then pops the saved EBP off the stack. Like the "ret" instruction that follows, this is part of the function epilogue.
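To see that "leave" undoes the prologue, here is a toy simulation in C. Everything in it (the toy_cpu struct, the 256-byte stack, the starting register values used in the note below) is made up for illustration; it only models the effect of the instructions on esp, ebp and the saved word.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64u /* toy stack: 64 words, byte addresses 0..255 */

/* Hypothetical toy machine: two registers holding byte addresses. */
struct toy_cpu {
    uint32_t esp;
    uint32_t ebp;
    uint32_t stack[STACK_WORDS];
};

void push32(struct toy_cpu *c, uint32_t value)
{
    assert(c->esp >= 4 && c->esp <= 4 * STACK_WORDS && c->esp % 4 == 0);
    c->esp -= 4;                  /* the stack grows downward */
    c->stack[c->esp / 4] = value;
}

uint32_t pop32(struct toy_cpu *c)
{
    assert(c->esp % 4 == 0 && c->esp / 4 < STACK_WORDS);
    uint32_t value = c->stack[c->esp / 4];
    c->esp += 4;
    return value;
}

/* pushl %ebp; movl %esp, %ebp; andl $-16, %esp; subl $32, %esp */
void prologue(struct toy_cpu *c)
{
    push32(c, c->ebp);
    c->ebp = c->esp;
    c->esp &= (uint32_t)-16;
    assert(c->esp >= 32);
    c->esp -= 32;
}

/* leave: movl %ebp, %esp; popl %ebp */
void leave(struct toy_cpu *c)
{
    c->esp = c->ebp;
    c->ebp = pop32(c);
}
```

Starting from esp = 200 and ebp = 248, the prologue leaves ebp = 196 and esp = 160, and leave brings both registers back to 200 and 248, no matter how much the alignment step moved esp in between.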

  .cfi_restore 5
  .cfi_def_cfa 4, 4

This is more DWARF stuff.

  ret 

This is the actual return instruction. It returns from the function.

  .cfi_endproc
 .LFE0:
  .size main, .-main
  .ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
  .section .note.GNU-stack,"",@progbits

For me, Intel syntax is easier to read, and learning how to generate Intel syntax is useful for understanding C programs better:

 gcc -S -masm=intel file.c 

On Windows, your C program compiles to:

        .file   "file.c"
        .intel_syntax noprefix
        .def    ___main;        .scl    2;      .type   32;     .endef
        .section .rdata,"dr"
    LC0:
        .ascii "%d\0"
        .text
        .globl  _main
        .def    _main;  .scl    2;      .type   32;     .endef
    _main:
    LFB13:
        .cfi_startproc
        push    ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        mov     ebp, esp
        .cfi_def_cfa_register 5
        and     esp, -16
        sub     esp, 32
        call    ___main
        mov     DWORD PTR [esp+28], 4
        mov     eax, DWORD PTR [esp+28]
        mov     DWORD PTR [esp+4], eax
        mov     DWORD PTR [esp], OFFSET FLAT:LC0
        call    _printf
        mov     eax, 0
        leave
        .cfi_restore 5
        .cfi_def_cfa 4, 4
        ret
        .cfi_endproc
    LFE13:
        .ident  "GCC: (rev2, Built by MinGW-builds project) 4.8.1"
        .def    _printf;        .scl    2;      .type   32;     .endef

(The compiler options should be the same on Ubuntu as on Windows.)

Apart from the psychotic labels, this looks more like the assembly I have read about in textbooks.

Here is a way to look at it:

        call    ___main
        mov     DWORD PTR [esp+28], 4
        mov     eax, DWORD PTR [esp+28]            ; int n = 4;
        mov     DWORD PTR [esp+4], eax
        mov     DWORD PTR [esp], OFFSET FLAT:LC0
        call    _printf                            ; printf("%d",n);
        mov     eax, 0
        leave                                      ; return 0;

Source: https://habr.com/ru/post/1492760/

