How to create multibyte characters in C

While learning about character encoding in C and C++, I came across two common approaches: multibyte characters and wide characters. To better understand their advantages and disadvantages, I wanted to write a few examples. Wide character examples are no problem thanks to the built-in wchar_t type. But when I tried to create a string containing so-called multibyte characters, I got stuck.

How can I create a multibyte character string, in an encoding that works with a char array, using Visual C++? Such encodings really do exist: http://www.gnu.org/software/libc/manual/html_node/Shift-State.html . But I have only read about them and never seen a real example. Or do I need to invent my own encoding for this kind of string?

1 answer

If you can write a wide-character string literal, simply dropping the L prefix should give you a multibyte string literal in an implementation-defined encoding (gcc has an option to select it; I don't know about Visual C++).

If you have a wide-character string, you can get the equivalent multibyte string for the current C locale with the functions wcstombs (in <stdlib.h>) and wcsrtombs (in <wchar.h>).

The C++ locale system also provides a way to do this conversion. Look at the in and out members of the codecvt facet; I will not give a tutorial on them here, but the cppreference site has code examples, for instance for out.

I'm not sure how easily you will find support for shift-state encodings on Unix or Windows. Look at the encodings used for Chinese, Japanese, Korean, and Vietnamese (for example ISO-2022-JP, although I believe Unix generally uses EUC-JP and Windows uses Shift JIS).


Source: https://habr.com/ru/post/974833/

