wchar_t encoding on different platforms

I have run into a problem with encodings on different platforms (in my case, Windows and Linux). On Windows, wchar_t is 2 bytes, while on Linux it is 4 bytes. How can I "standardize" wchar_t so that it is the same size on both platforms? Is this difficult to implement without additional libraries? So far I am targeting the printf / wprintf API. The data is sent over a socket connection. Thanks.

2 answers

If you want to send Unicode data across platforms and architectures, I would suggest using the UTF-8 encoding and (8-bit) chars. UTF-8 has some advantages, such as having no endianness problems (UTF-8 is just a simple sequence of bytes, whereas UTF-16 and UTF-32 can be little-endian or big-endian ...).
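To make the endianness point concrete, here is a small C++ sketch. The names utf16_unit_bytes and kEuroUtf8 are made up for illustration and are not part of any API. It serializes the euro sign U+20AC as a single UTF-16 code unit in both byte orders; the two results differ, whereas the UTF-8 form of the same character is one fixed sequence of three bytes.

```cpp
#include <string>

// Hypothetical helper: serialize one UTF-16 code unit as two bytes
// in the requested byte order.
std::string utf16_unit_bytes(char16_t unit, bool big_endian) {
    const char hi = static_cast<char>((unit >> 8) & 0xFF);
    const char lo = static_cast<char>(unit & 0xFF);
    return big_endian ? std::string{hi, lo} : std::string{lo, hi};
}

// U+20AC (euro sign) in UTF-8: always these three bytes, in this order,
// regardless of the byte order of the machine that produced them.
const char kEuroUtf8[] = "\xE2\x82\xAC";
```

A receiver of UTF-16 data has to know (or detect) which of the two byte orders the sender used. A receiver of UTF-8 data does not.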

On Windows, just convert the text from UTF-8 to UTF-16 at the boundary of the Win32 API (since the Windows APIs work with UTF-16). You can use the MultiByteToWideChar() API to do this.
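A minimal sketch of that boundary conversion might look like the following. The function name utf8_to_wide is hypothetical. On Windows it calls MultiByteToWideChar twice (once to get the required length, once to convert). The #else branch is not from the answer: it is a hand-written decoder, added only so the same function also builds on platforms where wchar_t is 32 bits, and it assumes each wchar_t holds one code point.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: UTF-8 bytes -> wide string.
// Throws std::runtime_error on malformed input.
// Inputs longer than INT_MAX bytes are not handled on Windows.
std::wstring utf8_to_wide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
#ifdef _WIN32
    const int in_len = static_cast<int>(utf8.size());
    // First call: ask how many UTF-16 code units are needed.
    const int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                      utf8.data(), in_len, nullptr, 0);
    if (n <= 0) throw std::runtime_error("invalid UTF-8");
    std::wstring wide(static_cast<std::size_t>(n), L'\0');
    // Second call: convert into the correctly sized buffer.
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            utf8.data(), in_len, &wide[0], n) != n)
        throw std::runtime_error("conversion failed");
    return wide;
#else
    // Assumption: wchar_t is wide enough to hold any code point.
    static_assert(sizeof(wchar_t) >= 4, "fallback assumes 32-bit wchar_t");
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const unsigned long min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    std::wstring wide;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        unsigned long cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        if (extra > utf8.size() - i - 1)
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, surrogates and values above U+10FFFF.
        if (cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point");
        wide += static_cast<wchar_t>(cp);
        i += extra + 1;
    }
    return wide;
#endif
}
```

With a helper like this, the rest of the program can keep text in std::string as UTF-8 and only widen it immediately before calling a wide-character API.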


To solve this problem, I think you will need to convert all strings to UTF-8 before sending them. On Windows, you should use the WideCharToMultiByte function to convert wchar_t strings to UTF-8 strings and MultiByteToWideChar to convert UTF-8 strings to wchar_t strings.
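The sending direction could be sketched as below. The function name wide_to_utf8 is hypothetical. The Windows branch uses WideCharToMultiByte with CP_UTF8, where the last two parameters must be null. The #else branch is an addition for illustration, not part of the answer: it encodes by hand and assumes each wchar_t holds one UTF-32 code point, as is typical on Linux.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: wide string -> UTF-8 bytes, ready to be sent.
// Throws std::runtime_error if the input is not valid Unicode.
std::string wide_to_utf8(const std::wstring& wide) {
    if (wide.empty()) return std::string();
#ifdef _WIN32
    const int in_len = static_cast<int>(wide.size());
    // WC_ERR_INVALID_CHARS (Windows Vista and later) makes the call fail
    // on unpaired surrogates instead of silently replacing them.
    const int n = WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS,
                                      wide.data(), in_len,
                                      nullptr, 0, nullptr, nullptr);
    if (n <= 0) throw std::runtime_error("invalid UTF-16 input");
    std::string utf8(static_cast<std::size_t>(n), '\0');
    if (WideCharToMultiByte(CP_UTF8, WC_ERR_INVALID_CHARS,
                            wide.data(), in_len,
                            &utf8[0], n, nullptr, nullptr) != n)
        throw std::runtime_error("conversion failed");
    return utf8;
#else
    // Assumption: one wchar_t == one code point (32-bit wchar_t).
    static_assert(sizeof(wchar_t) >= 4, "fallback assumes 32-bit wchar_t");
    std::string utf8;
    for (const wchar_t wc : wide) {
        const unsigned long cp = static_cast<unsigned long>(wc);
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point");
        if (cp < 0x80) {
            utf8 += static_cast<char>(cp);
        } else if (cp < 0x800) {
            utf8 += static_cast<char>(0xC0 | (cp >> 6));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            utf8 += static_cast<char>(0xE0 | (cp >> 12));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            utf8 += static_cast<char>(0xF0 | (cp >> 18));
            utf8 += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return utf8;
#endif
}
```

The resulting std::string holds the same bytes on both platforms, so the size of wchar_t no longer matters on the wire.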

On Linux, things are not so simple. You can use the wctomb and mbtowc functions; however, what they convert to and from depends on the current locale setting. Therefore, if you want them to convert between UTF-8 and Unicode wide characters, you need to make sure that the locale is set to use the UTF-8 encoding.
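As a rough sketch of the locale-dependent route, the code below first tries to select a UTF-8 locale and then converts with wctomb one character at a time. The helper names use_utf8_locale and wide_to_locale_bytes are hypothetical, and the locale names tried are assumptions: which ones exist depends on what is installed on the system, so the caller has to check the return value.

```cpp
#include <climits>   // MB_LEN_MAX
#include <clocale>   // std::setlocale, LC_CTYPE
#include <cstddef>
#include <cstdlib>   // std::wctomb
#include <stdexcept>
#include <string>

// Hypothetical helper: try to switch character handling to a UTF-8 locale.
// The candidate names are assumptions; none of them is guaranteed to exist.
bool use_utf8_locale() {
    const char* candidates[] = {"C.UTF-8", "en_US.UTF-8", "en_US.utf8"};
    for (const char* name : candidates) {
        if (std::setlocale(LC_CTYPE, name) != nullptr) return true;
    }
    return false;
}

// Hypothetical helper: convert with wctomb, which encodes according to the
// CURRENT LC_CTYPE locale. The output is UTF-8 only if that locale is UTF-8.
std::string wide_to_locale_bytes(const std::wstring& wide) {
    std::string out;
    char buf[MB_LEN_MAX];
    (void)std::wctomb(nullptr, L'\0');  // reset any conversion state
    for (const wchar_t wc : wide) {
        const int n = std::wctomb(buf, wc);
        if (n < 0)
            throw std::runtime_error(
                "character not representable in the current locale");
        out.append(buf, static_cast<std::size_t>(n));
    }
    return out;
}
```

Because the locale is process-wide state, a program that cannot control its locale may prefer a conversion that does not depend on it.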

This article may also be a good resource.


Source: https://habr.com/ru/post/1494103/

