GCC 4.8, 5.1, 6.2 and Clang 3.8.1 on Ubuntu 16.10, with -std=c11, -std=c++11, -std=c++14 and -std=c++17, show this strange behavior when using fgetws(buf, (int) bufsize, stdin) after setlocale(LC_ALL, "any_THING.utf8").
Program example:

    #include <locale.h>
    #include <wchar.h>
    #include <stdlib.h>
    #include <stdio.h>

    int main(const int argc, const char* const * const argv) {
        (void) argc;
        setlocale(LC_ALL, argv[1]);
        const size_t len = 3;
        wchar_t *buf = (wchar_t *) malloc(sizeof (wchar_t) * len),
                *stat = fgetws(buf, (int) len, stdin);
        wprintf(L"[%ls], [%ls]\n", stat, buf);
        free(buf);
        return EXIT_SUCCESS;
    }
The cast on malloc is there only for C++ compatibility.

Compile it as follows:

    cc -std=c11 fg.c -o fg
Run it under Valgrind with argv[1] = "C" and 10 bytes piped to stdin, and we get:
    $ python3 -c 'print("5" * 10)' | \
      valgrind --leak-check=full --track-origins=yes --show-leak-kinds=all ./f C
    ==1775== Memcheck, a memory error detector
    ==1775== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
    ==1775== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
    ==1775== Command: ./f C
    ==1775==
    [55], [55]
    ==1775==
    ==1775== HEAP SUMMARY:
    ==1775==     in use at exit: 0 bytes in 0 blocks
    ==1775==   total heap usage: 5 allocs, 5 frees, 25,612 bytes allocated
    ==1775==
    ==1775== All heap blocks were freed -- no leaks are possible
    ==1775==
    ==1775== For counts of detected and suppressed errors, rerun with: -v
    ==1775== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
The program works fine, and Valgrind reports no memory errors.
If it is run with a UTF-8 locale as argv[1], we still get the correct output, but Valgrind reports an invalid read at address 0x18, followed by a fatal segmentation fault:
    $ python3 -c 'print("5" * 10)' | \
      valgrind --leak-check=full --track-origins=yes --show-leak-kinds=all ./f en_US.utf8
    ==1934== Memcheck, a memory error detector
    ==1934== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
    ==1934== Using Valgrind-3.12.0.SVN and LibVEX; rerun with -h for copyright info
    ==1934== Command: ./f en_US.utf8
    ==1934==
    [55], [55]
    ==1934== Invalid read of size 8
    ==1934==    at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
    ==1934==    by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
    ==1934==    by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
    ==1934==    by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
    ==1934==    by 0x4EB79B5: _IO_cleanup (genops.c:966)
    ==1934==    by 0x4E73282: __run_exit_handlers (exit.c:96)
    ==1934==    by 0x4E73339: exit (exit.c:105)
    ==1934==    by 0x4E593F7: (below main) (libc-start.c:325)
    ==1934==  Address 0x18 is not stack'd, malloc'd or (recently) free'd
    ==1934==
    ==1934==
    ==1934== Process terminating with default action of signal 11 (SIGSEGV)
    ==1934==  Access not within mapped region at address 0x18
    ==1934==    at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
    ==1934==    by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
    ==1934==    by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
    ==1934==    by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
    ==1934==    by 0x4EB79B5: _IO_cleanup (genops.c:966)
    ==1934==    by 0x4E73282: __run_exit_handlers (exit.c:96)
    ==1934==    by 0x4E73339: exit (exit.c:105)
    ==1934==    by 0x4E593F7: (below main) (libc-start.c:325)
    ==1934==  If you believe this happened as a result of a stack
    ==1934==  overflow in your program main thread (unlikely but
    ==1934==  possible), you can try to increase the size of the
    ==1934==  main thread stack using the --main-stacksize= flag.
    ==1934==  The main thread stack size used in this run was 8388608.
    ==1934==
    ==1934== Process terminating with default action of signal 11 (SIGSEGV)
    ==1934==  Access not within mapped region at address 0x18
    ==1934==    at 0x4EAF575: _IO_wfile_sync (wfileops.c:534)
    ==1934==    by 0x4EB6DB1: _IO_default_setbuf (genops.c:523)
    ==1934==    by 0x4EB2FC8: _IO_file_setbuf@@GLIBC_2.2.5 (fileops.c:459)
    ==1934==    by 0x4EB79B5: _IO_unbuffer_all (genops.c:921)
    ==1934==    by 0x4EB79B5: _IO_cleanup (genops.c:966)
    ==1934==    by 0x4FAA93B: __libc_freeres (in /lib/x86_64-linux-gnu/libc-2.24.so)
    ==1934==    by 0x4A276EC: _vgnU_freeres (vg_preloaded.c:77)
    ==1934==    by 0x1101: ???
    ==1934==    by 0x3805234F: ??? (mc_malloc_wrappers.c:483)
    ==1934==    by 0x51FA8BF: ??? (in /lib/x86_64-linux-gnu/libc-2.24.so)
    ==1934==  If you believe this happened as a result of a stack
    ==1934==  overflow in your program main thread (unlikely but
    ==1934==  possible), you can try to increase the size of the
    ==1934==  main thread stack using the --main-stacksize= flag.
    ==1934==  The main thread stack size used in this run was 8388608.
    ==1934==
    ==1934== HEAP SUMMARY:
    ==1934==     in use at exit: 35,007 bytes in 149 blocks
    ==1934==   total heap usage: 233 allocs, 84 frees, 46,936 bytes allocated
    ==1934==
    ==1934== 11 bytes in 1 blocks are still reachable in loss record 1 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6396B: new_composite_name (setlocale.c:167)
    ==1934==    by 0x4E63F91: setlocale (setlocale.c:378)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 32 bytes in 1 blocks are still reachable in loss record 2 of 24
    ==1934==    at 0x4C2EB55: calloc (vg_replace_malloc.c:711)
    ==1934==    by 0x4EF288B: __wcsmbs_load_conv (wcsmbsload.c:168)
    ==1934==    by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
    ==1934==    by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
    ==1934==    by 0x4EAFC58: _IO_fwide (iofwide.c:124)
    ==1934==    by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
    ==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
    ==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 42 bytes in 1 blocks are still reachable in loss record 3 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
    ==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 50 bytes in 1 blocks are still reachable in loss record 4 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
    ==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 56 bytes in 1 blocks are still reachable in loss record 5 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
    ==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 92 bytes in 2 blocks are still reachable in loss record 6 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
    ==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 104 bytes in 1 blocks are still reachable in loss record 7 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
    ==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 132 bytes in 12 blocks are still reachable in loss record 8 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4EC5C49: strndup (strndup.c:43)
    ==1934==    by 0x4E64AB4: _nl_find_locale (findlocale.c:315)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 132 bytes in 12 blocks are still reachable in loss record 9 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4EC5BF9: strdup (strdup.c:42)
    ==1934==    by 0x4E63BCE: setlocale (setlocale.c:369)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 144 bytes in 2 blocks are still reachable in loss record 10 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
    ==1934==    by 0x4E6BE94: _nl_make_l10nflist (l10nflist.c:295)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 208 bytes in 1 blocks are still reachable in loss record 11 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E631C9: __gconv_lookup_cache (gconv_cache.c:372)
    ==1934==    by 0x4E5B34B: __gconv_find_transform (gconv_db.c:752)
    ==1934==    by 0x4EF296A: __wcsmbs_getfct (wcsmbsload.c:91)
    ==1934==    by 0x4EF296A: __wcsmbs_load_conv (wcsmbsload.c:186)
    ==1934==    by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
    ==1934==    by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
    ==1934==    by 0x4EAFC58: _IO_fwide (iofwide.c:124)
    ==1934==    by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
    ==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
    ==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 208 bytes in 1 blocks are still reachable in loss record 12 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E630EB: __gconv_lookup_cache (gconv_cache.c:372)
    ==1934==    by 0x4E5B34B: __gconv_find_transform (gconv_db.c:752)
    ==1934==    by 0x4EF2A0D: __wcsmbs_getfct (wcsmbsload.c:91)
    ==1934==    by 0x4EF2A0D: __wcsmbs_load_conv (wcsmbsload.c:189)
    ==1934==    by 0x4EF2B83: get_gconv_fcts (wcsmbsload.h:75)
    ==1934==    by 0x4EF2B83: __wcsmbs_clone_conv (wcsmbsload.c:223)
    ==1934==    by 0x4EAFC58: _IO_fwide (iofwide.c:124)
    ==1934==    by 0x4EAB1A4: _IO_getwline_info (iogetwline.c:58)
    ==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
    ==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 365 bytes in 12 blocks are still reachable in loss record 13 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 461 bytes in 12 blocks are still reachable in loss record 14 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 672 bytes in 12 blocks are still reachable in loss record 15 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 826 bytes in 24 blocks are still reachable in loss record 16 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BAE0: _nl_make_l10nflist (l10nflist.c:166)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 1,024 bytes in 1 blocks are still reachable in loss record 17 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4EA7381: _IO_file_doallocate (filedoalloc.c:101)
    ==1934==    by 0x4EA890C: _IO_wfile_doallocate (wfiledoalloc.c:70)
    ==1934==    by 0x4EAD159: _IO_wdoallocbuf (wgenops.c:390)
    ==1934==    by 0x4EAF39C: _IO_wfile_overflow (wfileops.c:441)
    ==1934==    by 0x4EACA12: __woverflow (wgenops.c:226)
    ==1934==    by 0x4EACA12: _IO_wdefault_xsputn (wgenops.c:331)
    ==1934==    by 0x4EAF7A0: _IO_wfile_xsputn (wfileops.c:1033)
    ==1934==    by 0x4E925EB: vfwprintf (vfprintf.c:1320)
    ==1934==    by 0x4EABA98: wprintf (wprintf.c:32)
    ==1934==    by 0x10885D: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 1,248 bytes in 12 blocks are still reachable in loss record 18 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 1,600 bytes in 1 blocks are still reachable in loss record 19 of 24
    ==1934==    at 0x4C2CA6F: malloc (vg_replace_malloc.c:298)
    ==1934==    by 0x4C2EDEF: realloc (vg_replace_malloc.c:785)
    ==1934==    by 0x4E6B692: extend_alias_table (localealias.c:397)
    ==1934==    by 0x4E6B692: read_alias_file (localealias.c:319)
    ==1934==    by 0x4E6B8B0: _nl_expand_alias (localealias.c:203)
    ==1934==    by 0x4E648D7: _nl_find_locale (findlocale.c:161)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 1,728 bytes in 24 blocks are still reachable in loss record 20 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E6BC70: _nl_make_l10nflist (l10nflist.c:241)
    ==1934==    by 0x4E6BDC6: _nl_make_l10nflist (l10nflist.c:285)
    ==1934==    by 0x4E64A05: _nl_find_locale (findlocale.c:218)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 2,048 bytes in 1 blocks are still reachable in loss record 21 of 24
    ==1934==    at 0x4C2ED5F: realloc (vg_replace_malloc.c:785)
    ==1934==    by 0x4E6B61C: read_alias_file (localealias.c:331)
    ==1934==    by 0x4E6B8B0: _nl_expand_alias (localealias.c:203)
    ==1934==    by 0x4E648D7: _nl_find_locale (findlocale.c:161)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 3,344 bytes in 12 blocks are still reachable in loss record 22 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4E64F09: _nl_intern_locale_data (loadlocale.c:95)
    ==1934==    by 0x4E64F09: _nl_load_locale (loadlocale.c:266)
    ==1934==    by 0x4E649B9: _nl_find_locale (findlocale.c:234)
    ==1934==    by 0x4E63B7B: setlocale (setlocale.c:340)
    ==1934==    by 0x108806: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 4,096 bytes in 1 blocks are still reachable in loss record 23 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4EA7381: _IO_file_doallocate (filedoalloc.c:101)
    ==1934==    by 0x4EA890C: _IO_wfile_doallocate (wfiledoalloc.c:70)
    ==1934==    by 0x4EB6875: _IO_doallocbuf (genops.c:398)
    ==1934==    by 0x4EAE493: _IO_wfile_underflow (wfileops.c:197)
    ==1934==    by 0x4EAC431: _IO_wdefault_uflow (wgenops.c:213)
    ==1934==    by 0x4EAB0E5: _IO_getwline_info (iogetwline.c:65)
    ==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
    ==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== 16,384 bytes in 1 blocks are still reachable in loss record 24 of 24
    ==1934==    at 0x4C2CB3F: malloc (vg_replace_malloc.c:299)
    ==1934==    by 0x4EA88D8: _IO_wfile_doallocate (wfiledoalloc.c:79)
    ==1934==    by 0x4EB6875: _IO_doallocbuf (genops.c:398)
    ==1934==    by 0x4EAE493: _IO_wfile_underflow (wfileops.c:197)
    ==1934==    by 0x4EAC431: _IO_wdefault_uflow (wgenops.c:213)
    ==1934==    by 0x4EAB0E5: _IO_getwline_info (iogetwline.c:65)
    ==1934==    by 0x4EAAC4A: fgetws (iofgetws.c:53)
    ==1934==    by 0x10883D: main (in /home/cat/projects/c/misc/fgetws/f)
    ==1934==
    ==1934== LEAK SUMMARY:
    ==1934==    definitely lost: 0 bytes in 0 blocks
    ==1934==    indirectly lost: 0 bytes in 0 blocks
    ==1934==      possibly lost: 0 bytes in 0 blocks
    ==1934==    still reachable: 35,007 bytes in 149 blocks
    ==1934==         suppressed: 0 bytes in 0 blocks
    ==1934==
    ==1934== For counts of detected and suppressed errors, rerun with: -v
    ==1934== ERROR SUMMARY: 2 errors from 1 contexts (suppressed: 0 from 0)
My question comes down to this: is this a bug in libc6 or libstdc++6? Or does fgetws after setting a UTF-8 locale exhibit undefined behavior (according to the glibc docs or the C standard)? Or is my code somehow wrong?
Please note that, judging by Valgrind's stack trace, this might look like a bug in Valgrind itself, but the program also segfaults when run without Valgrind and when run with AddressSanitizer (libasan).
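As a sanity check on the program itself, the two library calls can be wrapped so that every return value is tested before it is used. This is only a sketch: the helper names try_set_locale and read_wide_line are hypothetical and not part of the original program. In the runs above fgetws returned a non-null pointer (the output was [55], [55]), so these checks only rule out the error paths, such as a locale that could not be set or a null pointer passed to %ls.

```c
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if setlocale accepted `name`, 0 if
   `name` is null or the locale could not be set. */
static int try_set_locale(const char *name)
{
    return name != NULL && setlocale(LC_ALL, name) != NULL;
}

/* Hypothetical helper: reads at most len - 1 wide characters from `in`
   into `buf`. Returns 0 on success and -1 on bad arguments, end of file,
   or a read error. On failure `buf` holds an empty string when it is
   non-null and len > 0, so it stays safe to print with %ls. */
static int read_wide_line(FILE *in, wchar_t *buf, size_t len)
{
    if (buf == NULL || len == 0)
        return -1;
    buf[0] = L'\0';
    if (in == NULL || len > (size_t) INT_MAX)
        return -1;
    return fgetws(buf, (int) len, in) != NULL ? 0 : -1;
}
```

In main, the idea would be to call try_set_locale(argv[1]) first and stop with an error if it returns 0, then call read_wide_line(stdin, buf, len) and print buf only when it returns 0.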