Note: I am asking about implementation-defined behavior in Microsoft Visual C++ 2008 (possibly the same in 2005+). OS: Simplified Chinese installation of Windows 7.
I was surprised by what happens when I do non-ASCII I/O with printf. For instance:
    // This isn't necessary, as 936 is the system default code page.
    //system("chcp 936");

    // Passing NULL queries the current locale, which is "C"
    printf("%s\n", setlocale(LC_ALL, NULL));
    printf("中\n");

    printf("%s\n", setlocale(LC_ALL, "English"));
    printf("中\n");
Output:
    Active code page: 936
    C
    中
    English_United States.1252
    ?D
The memory view in the debugger shows that "中" is encoded as two bytes, 0xD6 0xD0, which is this character's encoding in code page 936 (Simplified Chinese). These bytes fall outside the range I would expect the "C" locale to cover, which is most likely 0x00 to 0x7F.
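To double-check the bytes without a debugger, something like the following sketch could be used. The helpers `hex_dump` and `all_ascii` are hypothetical names made up for this illustration, and the string is spelled with explicit escapes so the result does not depend on how the source file is encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the bytes of `s` into `out` as
   space-separated hex, e.g. "0xD6 0xD0". Stops early if `out` is full. */
static void hex_dump(const char *s, char *out, size_t out_size)
{
    size_t used = 0;

    assert(s != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    /* Each byte needs at most 5 characters (" 0xNN") plus the terminator. */
    for (; *s != '\0' && used + 6 <= out_size; ++s) {
        used += (size_t)sprintf(out + used, "%s0x%02X",
                                used ? " " : "", (unsigned char)*s);
    }
}

/* Hypothetical helper: returns 1 if every byte of `s` is in 0x00..0x7F. */
static int all_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s > 0x7F) {
            return 0;
        }
    }
    return 1;
}
```

Calling `hex_dump("\xD6\xD0", buf, sizeof buf)` with a 32-byte `buf` fills it with `0xD6 0xD0`, and `all_ascii("\xD6\xD0")` returns 0, which matches what the debugger shows.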
Question:
Why is the character displayed correctly in the "C" locale? It made me assume that the locale has nothing to do with printf. But then why can it no longer be displayed after switching to the "English" locale, whose code page (1252) also differs from 936? Interesting, isn't it?
Edit:
I redirected standard output to a file and did some tests. No matter what locale is set, the correct "中" character is saved in the file. This suggests that setlocale() affects the way the console displays the character, which contradicts my understanding of how it works: printf puts bytes into the input buffer of the console, and the console interprets these bytes using its own code page (the one reported by chcp).
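The redirect test can be approximated in code roughly as follows. This is only a sketch: `bytes_survive` and the file name passed to it are hypothetical, it writes to a file opened in binary mode rather than to a redirected stdout, and it only checks what ends up in the file, not what a console would display.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: sets the locale (if the name is accepted), writes
   the two CP936 bytes of "中" to `path` with fprintf, then reads the file
   back. Returns 1 if it holds exactly 0xD6 0xD0, and 0 otherwise. */
static int bytes_survive(const char *locale_name, const char *path)
{
    unsigned char buf[4];
    size_t n;
    FILE *f;

    assert(locale_name != NULL && path != NULL);

    /* setlocale returns NULL and changes nothing if the name is unknown. */
    setlocale(LC_ALL, locale_name);

    f = fopen(path, "wb");
    if (f == NULL) {
        return 0;
    }
    fprintf(f, "%s", "\xD6\xD0");
    fclose(f);

    f = fopen(path, "rb");
    if (f == NULL) {
        return 0;
    }
    n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    /* Restore the default so later calls start from the same state. */
    setlocale(LC_ALL, "C");

    return n == 2 && buf[0] == 0xD6 && buf[1] == 0xD0;
}
```

Calling it once with "C" and once with "English" (with any writable path) corresponds to the two cases in my test, where the file contents were the same in both.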