I am trying to convert a double-byte character sequence (DBCS) in CP936 to wchar_t in C++. This is the code:
    #include <cerrno>
    #include <cstring>
    #include <iostream>
    #include <locale>
    #include <codecvt>

    // 国 in CP936
    char const src[] = "\xB9\xFA";

    int main()
    {
        std::locale loc(".936");
        typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
        codecvt_type const & cvt = std::use_facet<codecvt_type>(loc);

        std::mbstate_t state;
        std::memset(&state, 0, sizeof(state));

        char const * src_mid = src;
        wchar_t buf[10];
        wchar_t * buf_mid = buf;

        std::codecvt_base::result res
            = cvt.in(state, src, src + 2, src_mid, buf, buf + 10, buf_mid);
        int eno = errno;

        std::cout << "res: " << +res << "\n"
                  << "errno: " << eno << "\n";
        return 0;
    }
The conversion always fails, and errno is set to 42, which is EILSEQ. I debugged the code and I think I see what is going wrong, but I do not understand why.
The problem is that the code path that ultimately calls MultiByteToWideChar() contains the following condition:
if ( ploc->_Isleadbyte[ch >> 3] & (1 << (ch & 7)) )
This branch is never taken, even though, AFAIK, the source string contains a valid lead byte and trail byte. I checked the _Isleadbyte array in the debugger and it is all zeros. So the branch that sets the input length to 2 is never taken. The branch that sets the length to 1 is taken instead, and MultiByteToWideChar() fails because a lead byte must be followed by a trail byte.
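To make the bit test above concrete, here is a minimal, portable sketch of how a 256-bit lead-byte bitmap packed into 32 bytes can be indexed with `ch >> 3` and `1 << (ch & 7)`. The names `LeadByteTable`, `mark_lead_range`, `is_lead_byte` and `sequence_length` are hypothetical stand-ins, not the CRT's actual internals, and the 0x81 to 0xFE lead-byte range for CP936 is an assumption stated here for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for the CRT's lead-byte bitmap:
// one bit per possible byte value, 256 bits packed into 32 bytes.
struct LeadByteTable {
    unsigned char bits[32] = {};  // all zeros: no byte is a lead byte
};

// Mark every byte value in [first, last] as a lead byte.
inline void mark_lead_range(LeadByteTable& t, unsigned char first, unsigned char last) {
    for (unsigned ch = first; ch <= last; ++ch) {
        t.bits[ch >> 3] |= static_cast<unsigned char>(1u << (ch & 7));
    }
}

// Same shape as the condition quoted above:
// byte index is ch / 8, bit index within that byte is ch % 8.
inline bool is_lead_byte(const LeadByteTable& t, unsigned char ch) {
    return (t.bits[ch >> 3] & (1u << (ch & 7))) != 0;
}

// Number of input bytes that would be handed to the converter for this character.
inline std::size_t sequence_length(const LeadByteTable& t, unsigned char ch) {
    return is_lead_byte(t, ch) ? 2 : 1;
}
```

With a default-constructed (all-zero) table, `sequence_length(table, 0xB9)` yields 1, which matches the failure described above. After `mark_lead_range(table, 0x81, 0xFE)` it yields 2.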
I even checked that C_936.NLS is present in C:\Windows\System32\ , so that should not be the problem.
So I think the question is: is this a problem on my end (my test code, my Windows installation, missing components), or is it a bug in the Visual Studio 2015 code?
UPDATE
So I accidentally stumbled upon this question: Shift-JIS decoding fails using wifstrem in Visual C++ 2013
The OP's own answer shows a workaround:
    const int oldMbcp = _getmbcp();
    _setmbcp(932);
    const std::locale locale("Japanese_Japan.932");
    _setmbcp(oldMbcp);
The same workaround works for CP936, which I am trying to use.
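For reference, this is roughly how the workaround could be wrapped for CP936 so that the previous multibyte code page is restored even if the std::locale constructor throws. It is only a sketch: `make_locale_with_mbcp` is a hypothetical helper name, and the non-MSVC branch exists only so that the snippet compiles elsewhere (it does nothing with the code page there).

```cpp
#include <locale>
#ifdef _MSC_VER
#include <mbctype.h>  // _getmbcp, _setmbcp (Microsoft CRT only)
#endif

// Hypothetical helper: construct a locale while the CRT multibyte code page
// is temporarily set to `codepage`, then restore the previous code page.
inline std::locale make_locale_with_mbcp(const char* name, int codepage) {
#ifdef _MSC_VER
    const int oldMbcp = _getmbcp();
    _setmbcp(codepage);
    // Restores the old code page on scope exit, including when
    // std::locale(name) throws std::runtime_error.
    struct Restore {
        int cp;
        ~Restore() { _setmbcp(cp); }
    } restore{oldMbcp};
    return std::locale(name);
#else
    (void)codepage;  // assumption: nothing to switch outside the Microsoft CRT
    return std::locale(name);
#endif
}
```

In the test program above, `std::locale loc(".936");` would then become `std::locale loc = make_locale_with_mbcp(".936", 936);`.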
UPDATE 2
I sent a bug report to Microsoft.