swprintf chokes on characters outside the 8-bit range

This happens on OS X, although I suspect this applies to any UNIX-y OS. I have two lines that look like this:

const wchar_t *test1 = (const wchar_t *) "\x44\x00\x00\x00\x73\x00\x00\x00\x00\x00\x00\x00";
const wchar_t *test2 = (const wchar_t *) "\x44\x00\x00\x00\x19\x20\x00\x00\x73\x00\x00\x00\x00\x00\x00\x00";

In the debugger, test1 looks like "Ds", and test2 looks like "D’s" (with a curly apostrophe, U+2019). Then I call this code:

wchar_t buf1[100], buf2[100];
int ret1 = swprintf(buf1, 100, L"%ls", test1);
int ret2 = swprintf(buf2, 100, L"%ls", test2);

The first call to swprintf works fine. The second returns -1 (and the buffer is left unchanged).

I guess the problem has something to do with locales, but googling around didn't turn up anything useful. This is the simplest way I've found to reproduce the problem. What I'm really interested in is vswprintf(), but I assume it behaves the same way.

Why is swprintf choking on a Unicode character that is outside the 8-bit range? Is there any way around this?

1 answer

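Try explicitly setting the locale to UTF-8. A C program starts in the "C" locale, and the return value of -1 suggests that swprintf on OS X cannot convert U+2019 in that locale.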

setlocale(LC_CTYPE, "UTF-8");
...
const wchar_t* test2 = L"D\x2019s";
int ret2 = swprintf(buf2, 100, L"%ls", test2);
...
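
For completeness, here is a slightly fuller sketch of the same idea. The helper names use_utf8_ctype and format_wide are made up for this example, and the locale names other than "UTF-8" are guesses for systems other than OS X, so treat it as an illustration rather than a definitive fix. The wrapper goes through vswprintf, since that is the function the question is really about.

```c
#include <assert.h>
#include <locale.h>
#include <stdarg.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: try a few names for a UTF-8 LC_CTYPE locale.
   "UTF-8" is the name used in the answer above (OS X); the other
   names are assumptions for other systems. The last entry, "", takes
   whatever the environment specifies, which may not be UTF-8 at all.
   Returns the name that setlocale accepted, or NULL if none worked. */
static const char *use_utf8_ctype(void)
{
    static const char *const names[] = {
        "UTF-8", "C.UTF-8", "en_US.UTF-8", ""
    };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; ++i) {
        if (setlocale(LC_CTYPE, names[i]) != NULL)
            return names[i];
    }
    return NULL;
}

/* Hypothetical wrapper around vswprintf. It returns the number of
   wide characters written, or a negative value on failure (for
   example when a character cannot be converted in the current locale). */
static int format_wide(wchar_t *buf, size_t cap, const wchar_t *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && cap > 0);
    va_start(ap, fmt);
    n = vswprintf(buf, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

Call use_utf8_ctype() once at startup, before any wide formatting, and check its result. After that, format_wide(buf2, 100, L"%ls", L"D\x2019s") should succeed instead of returning -1, assuming a UTF-8 locale was actually selected.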

Source: https://habr.com/ru/post/1751034/

