Mingw-w64: slow sprintf in <cstdio>
This <cstdio> header in C ++ contains only the same functions as <stdio.h> , but fits in the std ?
I had strange performance problems in my program compiled with mingw-w64, which is more than ten times slower than on linux. After some test, I found that the problem is in sprintf .
Then I performed the following test:
#include <stdio.h> // #include <cstdio> // using std::sprintf; int main () { int i; for (i = 0; i < 500000; i++){ char x[100]; sprintf(x, "x%dx%dx", i, i<<2); } } When compiling with <stdio.h> it is 15 times faster than when using <cstdio> . Here is the time:
$ time ./stdio real 0m0.557s user 0m0.046s sys 0m0.046s $ time ./cstdio real 0m7.465s user 0m0.031s sys 0m0.077s $ g++ --version g++.exe (rubenvb-4.8-stdthread) 4.8.1 20130324 (prerelease) Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. UPDATE 1 : I also confined to different mingw-w64 builds (rubenvb, drangon and mingw-build) and found that the entire 32-bit version using <cstdio> timed 4.x seconds and 64-bit versions 7.x ~ 8 .x seconds. And all versions using <stdio.h> timed for approximately 0.4 ~ 0.6 seconds.
UPDATE 2 : I parsed the main function in gdb and found only one line: the version <stdio.h> calls callq 0x4077c0 <sprintf> , but the version <cstdio> calls callq 0x407990 <_Z7sprintfPcPKcz> .
sprintf contains:
0x00000000004077c0 <+0>: jmpq *0x7c6e(%rip) # 0x40f434 <__imp_sprintf> 0x00000000004077c6 <+6>: nop 0x00000000004077c7 <+7>: nop Following __imp_sprintf , I reached sprinf inside msvcrt.dll .
_Z7sprintfPcPKcz contains some text codes:
0x0000000000407990 <+0>: push %rbp 0x0000000000407991 <+1>: push %rbx 0x0000000000407992 <+2>: sub $0x38,%rsp 0x0000000000407996 <+6>: lea 0x80(%rsp),%rbp 0x000000000040799e <+14>: mov %rcx,-0x30(%rbp) 0x00000000004079a2 <+18>: mov %r8,-0x20(%rbp) 0x00000000004079a6 <+22>: mov %r9,-0x18(%rbp) 0x00000000004079aa <+26>: mov %rdx,-0x28(%rbp) 0x00000000004079ae <+30>: lea -0x20(%rbp),%rax 0x00000000004079b2 <+34>: mov %rax,-0x58(%rbp) 0x00000000004079b6 <+38>: mov -0x58(%rbp),%rdx 0x00000000004079ba <+42>: mov -0x28(%rbp),%rax 0x00000000004079be <+46>: mov %rdx,%r8 0x00000000004079c1 <+49>: mov %rax,%rdx 0x00000000004079c4 <+52>: mov -0x30(%rbp),%rcx 0x00000000004079c8 <+56>: callq 0x402c40 <__mingw_vsprintf> 0x00000000004079cd <+61>: mov %eax,%ebx 0x00000000004079cf <+63>: mov %ebx,%eax 0x00000000004079d1 <+65>: add $0x38,%rsp 0x00000000004079d5 <+69>: pop %rbx 0x00000000004079d6 <+70>: pop %rbp Why cstdio use a different (and much slower) function?
libstdC ++ defines __USE_MINGW_ANSI_STDIO at build time ( config/os/mingw32-w64/os_defines.h ), which will include the mingw sprintf wrapper. As @Michael Burr noted, these wrappers exist for compatibility with C99 / GNU99.
Your test does not define __USE_MINGW_ANSI_STDIO , so you will not get a shell with stdio.h . But since it was defined when building libstdc ++, you get it using cstdio . If you, however, define it yourself by including stdio.h , you will again get a shell.
So you really get different implementations, and cstdio std::sprintf does not necessarily match stdio.h sprintf , at least not when it comes to mingw.
Here is a test. First source:
#ifdef USE_STDIO #include <stdio.h> #else #include <cstdio> using std::sprintf; #endif int main () { int i; for (i = 0; i < 500000; i++){ char x[100]; sprintf(x, "x%dx%dx", i, i<<2); } } Results:
$ g++ -o test_cstdio.exe test.cc $ g++ -o test_stdio.exe -DUSE_STDIO test.cc $ g++ -o test_stdio_wrap.exe -DUSE_STDIO -D__USE_MINGW_ANSI_STDIO test.cc $ for x in test_*.exe; do ( echo $x; objdump -d $x | grep sprintf; echo ); done test_cstdio.exe 40154a: e8 41 64 00 00 callq 407990 <_Z7sprintfPcPKcz> 0000000000402c40 <__mingw_vsprintf>: 0000000000407990 <_Z7sprintfPcPKcz>: 4079c8: e8 73 b2 ff ff callq 402c40 <__mingw_vsprintf> test_stdio.exe 40154a: e8 71 62 00 00 callq 4077c0 <sprintf> 00000000004077c0 <sprintf>: 4077c0: ff 25 6e 6c 00 00 jmpq *0x6c6e(%rip) # 40e434 <__imp_sprintf> test_stdio_wrap.exe 40154a: e8 41 64 00 00 callq 407990 <_Z7sprintfPcPKcz> 0000000000402c40 <__mingw_vsprintf>: 0000000000407990 <_Z7sprintfPcPKcz>: 4079c8: e8 73 b2 ff ff callq 402c40 <__mingw_vsprintf>