Does __cdecl lead to a larger executable than __stdcall?

I found this:

Since the stack is cleaned up by the called function, the __stdcall calling convention creates smaller executables than __cdecl, in which stack cleanup code must be generated for each function call.

Suppose I have 2 functions:

    void __cdecl func1(int x)
    {
        // do some stuff using x
    }

    void __stdcall func2(int x, int y)
    {
        // do some stuff using x, y
    }

and here is main():

    int main()
    {
        func1(5);
        func2(5, 6);
    }

As I understand it, main() is responsible for cleaning up the stack after the call func1(5), while func2 cleans up the stack itself for the call func2(5, 6), right?

Four questions:

1. For the call to func1 in main(), main is responsible for cleaning up the stack, so does the compiler insert stack cleanup code before and after the call to func1? Like this:

    int main()
    {
        before_call_to_cdecl_func();  // compiler generated code for stack-clean-up of cdecl-func-call
        func1(5);
        after_call_to_cdecl_func();   // compiler generated code for stack-clean-up of cdecl-func-call
        func2(5, 6);
    }

2. For the call to func2 in main(), cleaning up the stack is func2's own job, so I believe no code will be inserted into main() before or after the call to func2, right?

3. Because func2 is __stdcall, I assume that the compiler automatically inserts stack cleanup code into it, as follows:

    void __stdcall func1(int x, int y)
    {
        before_call_to_stdcall_func();  // compiler generated code for stack-clean-up of stdcall-func-call
        // do some stuff using x, y
        after_call_to_cdecl_func();     // compiler generated code for stack-clean-up of stdcall-func-call
    }

Is my guess right?

4. Finally, back to the quoted text: why does __stdcall lead to a smaller executable than __cdecl? And on Linux there is no such thing as __stdcall, right? Does this mean that a Linux ELF binary will always be larger than a Windows EXE?

+4
4 answers
  • It will insert code only after the call, namely to reset the stack pointer, and only if the call had arguments. *
  • __stdcall generates no cleanup code at the call site. Note, however, that compilers can merge the stack cleanup for several __cdecl calls into a single cleanup, or delay the cleanup to avoid pipeline stalls.
  • Ignoring the inverted order in this example, no; it will only insert code to clean up for the __cdecl function. Setting up the function's arguments is a separate matter (different compilers generate or prefer different methods).
  • __stdcall was more of a Windows thing, see this. The size of the binary depends on the number of calls to __cdecl functions: more calls mean more cleanup code, whereas __stdcall has only a single instance of the cleanup code. However, you should not see much of a size increase, since it typically amounts to only a few bytes per call.

* It is important to distinguish between cleaning up and setting up call parameters.

+5

Historically, the first C++ compilers used the equivalent of __stdcall. From a quality of implementation point of view, I would expect the C compiler to use the __cdecl convention and the C++ compiler to use __stdcall (which was known as the Pascal convention back then). This is one thing that the early Zortech compilers got right.

Of course, vararg functions must still use __cdecl. The callee cannot clean up the stack if it does not know how much to clean up.

(Note that the C standard was carefully designed to allow __stdcall in C as well. I know of only one compiler that took advantage of this, however; the amount of existing code at the time that called vararg functions without a prototype in scope was huge, and although the standard declared such code broken, compiler implementers did not want to break their clients' code.)

In many circles, there seems to be a very strong tendency to insist that the C and C++ conventions be the same, so that you can take the address of an extern "C++" function and pass it to a function written in C that calls it. IIRC, for example, g++ does not treat

 extern "C" void f(); 

and

 void f(); 

as having two different types (even though the standard requires it), and allows you to pass the address of a static member function to pthread_create, for example. As a result, such compilers use exactly the same conventions everywhere, and on Intel those are the equivalent of __cdecl.

Many compilers have extensions to support other conventions. (Why they don't use the standard extern "xxx", I don't know.) The syntax for these extensions varies widely, however. Microsoft puts the attribute immediately before the function name:

 void __stdcall func( int, int ); 

g++ puts it in a special attribute clause after the function declaration:

 void func( int, int ) __attribute__((stdcall)); 

C++11 adds a standard way to specify attributes:

 void [[stdcall]] func( int, int ); 

It does not specify stdcall as an attribute, but it does say that additional attributes (beyond those defined in the standard) may be specified and are implementation-dependent. I would expect both g++ and VC++ to accept this syntax in their most recent versions, at least when C++11 is enabled. The exact name of the attribute (__stdcall, stdcall, etc.) may vary, however, so you will probably want to wrap it in a macro.

Finally: with a modern compiler and optimization enabled, the difference between the calling conventions is probably negligible. Attributes like const (not to be confused with the C++ const keyword), regparm or noreturn are likely to have a greater impact, both on executable size and on performance.

+3

This whole set of calling conventions is history with the new 64-bit ABI.

http://en.wikipedia.org/wiki/X86_calling_conventions#x86-64_calling_conventions

There is also the ABI side of things for different architectures (e.g. ARM). Not everything works the same way on all architectures, so don't worry about this convention!

http://en.wikipedia.org/wiki/Calling_convention

The improvement in EXE size is negligible (it may not exist at all), so don't worry about it.

__cdecl is much more flexible than __stdcall: it supports a variable number of arguments, the cleanup code (a few instructions) is insignificant, and a __cdecl function can be called with the wrong number of arguments without necessarily causing a serious problem. The same situation with __stdcall always goes wrong!

+1

Others answered other parts of your question, so I will just add my answer about the size:

4. Finally, back to the quoted text: why does __stdcall lead to a smaller executable than __cdecl?

This seems to be wrong. I tested it by compiling libudis with and without the stdcall calling convention. First without:

    $ clang -target i386-pc-win32 -DHAVE_CONFIG_H -Os -I.. -I/usr/include -fPIC -c *.c && strip *.o
    $ du -cb *.o
    6524    decode.o
    95932   itab.o
    1434    syn-att.o
    1706    syn-intel.o
    2288    syn.o
    1245    udis86.o
    109129  totalt

And with. It is the -mrtd switch that enables stdcall:

    $ clang -target i386-pc-win32 -DHAVE_CONFIG_H -Os -I.. -I/usr/include -fPIC -mrtd -c *.c && strip *.o
    7084    decode.o
    95932   itab.o
    1502    syn-att.o
    1778    syn-intel.o
    2296    syn.o
    1305    udis86.o
    109897  totalt

As you can see, cdecl beats stdcall by several hundred bytes. It could be that my testing methodology is flawed, or that clang's stdcall code generator is weak. But I think that with modern compilers, the extra flexibility afforded by caller cleanup means that they will always generate better code with cdecl than with stdcall.

0

Source: https://habr.com/ru/post/1395179/

