gcc is just being defensive with -m32, by not assuming that main is called with a properly 16B-aligned stack.
The i386 System V ABI has guaranteed/required for years that ESP+4 is 16B-aligned on entry to a function. (That is, ESP must be 16B-aligned before a CALL instruction, so args on the stack start at a 16B boundary. This is the same as for x86-64 System V.)
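The arithmetic behind that rule can be sketched in C. This is only a model of the address math, not code that reads the real stack pointer, and the helper names (is_16b_aligned, esp_at_entry, entry_alignment_ok) are made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of 16. */
static inline bool is_16b_aligned(uint32_t addr) {
    return (addr & 15u) == 0;
}

/* In 32-bit mode CALL pushes a 4-byte return address, so the callee
   sees ESP 4 bytes below the caller's pre-CALL value. */
static inline uint32_t esp_at_entry(uint32_t esp_before_call) {
    return esp_before_call - 4u;
}

/* The rule as stated above: ESP+4 must be 16B-aligned on function entry. */
static inline bool entry_alignment_ok(uint32_t esp_on_entry) {
    return is_16b_aligned(esp_on_entry + 4u);
}
```

With an example pre-CALL value of 0xffffd000, the callee sees ESP = 0xffffcffc, and ESP+4 is back on the 16B boundary where the first stack arg lives.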
The ABI also guarantees that new 32-bit processes start with ESP aligned on a 16B boundary (for example at _start, the ELF entry point, where ESP points at argc rather than a return address), and the glibc CRT code maintains that alignment.
As far as the calling convention is concerned, EBP is just another call-preserved register. But yes, compiler output with -fno-omit-frame-pointer does take care to push ebp before any other call-preserved registers it needs to save (for example EBX), and it does this even in functions that don't otherwise need EBP, so the saved EBP values form a linked list.
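To make the linked-list point concrete, here is a toy model in C. The struct mirrors what sits at [ebp] and [ebp+4] in a function that set up a frame pointer. The names (struct frame, count_frames, demo_chain_depth) are hypothetical, and ending the chain with NULL is an assumption of this sketch, not something a real backtracer can rely on everywhere:

```c
#include <stddef.h>

/* What a frame-pointer-based frame looks like from EBP upward:
   [ebp]   = caller's saved EBP (the "next" pointer of the list)
   [ebp+4] = return address pushed by CALL */
struct frame {
    const struct frame *saved_ebp;
    const void *return_address;
};

/* Follow saved-EBP links until the chain ends (NULL here, by assumption). */
static inline int count_frames(const struct frame *ebp) {
    int depth = 0;
    while (ebp != NULL) {
        depth++;
        ebp = ebp->saved_ebp;
    }
    return depth;
}

/* Build a fake three-deep call chain and walk it. */
static inline int demo_chain_depth(void) {
    const struct frame outer  = { NULL,    NULL };
    const struct frame middle = { &outer,  NULL };
    const struct frame inner  = { &middle, NULL };
    return count_frames(&inner);
}
```

The walk only works if every function in the chain pushed EBP first, which is why the push order matters even though the calling convention itself doesn't care.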
Perhaps gcc is defensive because an extremely ancient Linux kernel (from before that revision of the i386 ABI, when the required alignment was only 4B) could violate that assumption, and it's only a couple of extra instructions that run once in the lifetime of the process (provided the program doesn't call main recursively).
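The realignment itself is cheap: gcc's -m32 prologue for main typically includes an and esp, -16, which just clears the low four bits of ESP. A minimal C model of that one instruction (the function name realign_down_16 is made up for this sketch, and it models only the masking, not the rest of the prologue):

```c
#include <stdint.h>

/* Model of "and esp, -16": round ESP down to a 16B boundary.
   The stack grows down, so rounding down never overlaps
   anything the caller already stored. */
static inline uint32_t realign_down_16(uint32_t esp) {
    return esp & ~UINT32_C(15);
}
```

If ESP was only 4B-aligned, say 0xffffd00c, this gives 0xffffd000. If it was already aligned, the value doesn't change, so the instruction is harmless under a modern kernel.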
Unlike gcc, clang assumes the stack is properly aligned on entry to main. (clang also assumes that narrow args have been sign- or zero-extended to 32 bits, even though the current ABI revision doesn't specify that behaviour (yet). gcc and clang both emit code that does the extension on the caller side, but only clang depends on it in the callee. This happens in 64-bit code, but I didn't check 32-bit.)
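For anyone unsure what extending a narrow arg to 32 bits means, here is a small C sketch of the two cases for an 8-bit value. The function names are hypothetical, and this only shows the bit patterns a caller would put in the 32-bit arg slot, not what either compiler emits:

```c
#include <stdint.h>

/* Zero-extension (unsigned char arg): upper 24 bits are 0. */
static inline uint32_t zero_extend_8_to_32(uint8_t low_byte) {
    return (uint32_t)low_byte;
}

/* Sign-extension (signed char arg): upper 24 bits copy bit 7.
   The xor/subtract trick avoids relying on implementation-defined
   narrowing conversions. */
static inline uint32_t sign_extend_8_to_32(uint8_t low_byte) {
    return ((uint32_t)low_byte ^ 0x80u) - 0x80u;
}
```

A callee that depends on this can use the full 32-bit register directly. A callee that doesn't has to redo the extension itself, ignoring whatever is in the upper bits.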
Look at the compiler output on http://gcc.godbolt.org/ for main and for functions other than main, if you're curious.
I just updated the ABI links in the x86 tag wiki the other day. http://x86-64.org/ is still dead and doesn't seem to be coming back, so I updated the System V links to point to the PDFs of the current revision in HJ Lu's github repo, and to his page with links.
Note that the last version on SCO's site is not the current revision, and doesn't include the 16B stack-alignment requirement.