What exactly does __STDC_ISO_10646__ mean?

I find it hard to understand the __STDC_ISO_10646__ macro from my copy of the C++ standard:

__STDC_ISO_10646__

An integer constant of the form yyyymmL (for example, 199712L). If this symbol is defined, then every character in the Unicode required set, when stored in an object of type wchar_t, has the same value as the short identifier of that character. The Unicode required set consists of all the characters that are defined by ISO/IEC 10646, along with all amendments and technical corrigenda as of the specified year and month.

My reading is that this means a wchar_t on your system holds a Unicode code point. Is that right? If so, then the UTF-8 and UTF-16 encodings will not match and UTF-32 will match, right? Also, what other character encodings will match?

2 answers

The section of the standard that you quote (§16.8 Predefined macro names [cpp.predefined]) prefaces a series of definitions with:

¶2 The following macro names are conditionally defined by the implementation:

This means that if the implementation cannot satisfy the requirements (for example, because wchar_t is a 16-bit type), the implementation will not define __STDC_ISO_10646__.
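
As a minimal sketch of what that looks like in code, the following C++ tests for the macro at compile time. The helper names wchar_is_iso_10646, iso_10646_version and check_wide_literals are hypothetical, made up for this illustration; only __STDC_ISO_10646__ itself comes from the standard.

```cpp
#include <cassert>

// Hypothetical helper: true if this implementation promises that wchar_t
// values are ISO/IEC 10646 (Unicode) short identifiers.
constexpr bool wchar_is_iso_10646() {
#ifdef __STDC_ISO_10646__
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: the yyyymm value of the macro, or 0 if it is not defined.
constexpr long iso_10646_version() {
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;
#else
    return 0L;
#endif
}

// Only when the macro is defined can we rely on a wide character having the
// same numeric value as its Unicode short identifier.
inline void check_wide_literals() {
#ifdef __STDC_ISO_10646__
    assert(L'A' == 0x41);         // U+0041 LATIN CAPITAL LETTER A
    assert(L'\u20AC' == 0x20AC);  // U+20AC EURO SIGN
#endif
}
```

Code that depends on wchar_t holding code points can branch on wchar_is_iso_10646() and fall back to an explicit conversion otherwise. The macro is typically defined with glibc on Linux and typically not defined on Windows, where wchar_t is 16 bits, but check your own toolchain rather than relying on that.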

On the other hand, if wchar_t is a 32-bit or larger type, then the implementation may be able to define the macro. ISO 10646 needs only 21 bits to represent all of its characters, but for almost all practical purposes this means that a 16-bit wchar_t is too small and a 32-bit wchar_t is large enough. It also means that an implementation written from scratch can make wchar_t a 32-bit type. Pre-existing implementations may be constrained by backward compatibility if they chose a 16-bit wchar_t before this feature was standardized.
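
The 21-bit figure and the 16-bit problem can be checked with a short sketch. The names below (kMaxCodePoint, bits_needed, fits_all_code_points, wchar_bits, code_units) are hypothetical helpers for this illustration, and the sketch assumes that u"..." literals are UTF-16 and U"..." literals are UTF-32, which holds on mainstream compilers.

```cpp
#include <climits>
#include <cstddef>

// Largest code point defined by ISO/IEC 10646 / Unicode.
const unsigned long kMaxCodePoint = 0x10FFFF;

// Number of bits needed to represent `value` (returns 0 for value 0).
inline int bits_needed(unsigned long value) {
    int bits = 0;
    while (value != 0) {
        ++bits;
        value >>= 1;
    }
    return bits;
}

// True if one unit of `unit_bits` bits can hold every code point directly.
inline bool fits_all_code_points(int unit_bits) {
    return unit_bits >= bits_needed(kMaxCodePoint);
}

// Width of wchar_t in bits on this implementation.
inline int wchar_bits() {
    return static_cast<int>(sizeof(wchar_t) * CHAR_BIT);
}

// Number of code units in a string literal, not counting the terminator.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}
```

For example, bits_needed(kMaxCodePoint) is 21, so fits_all_code_points(16) is false and fits_all_code_points(32) is true. Likewise code_units(u"\U0001F600") is 2 (a UTF-16 surrogate pair) while code_units(U"\U0001F600") is 1, which is why a UTF-16 code unit is not always equal to a character's short identifier and a UTF-32 one is.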


The macro refers to the value a Unicode character has when it is stored in a wchar_t.

More specifically, the ISO/IEC 10646 standard gains more characters as amendments are made to it.

The year and month that the macro expands to mean that when you store a Unicode character in a wchar_t variable, the value stored is the same as that character's short identifier in the standard as it stood in that year and month.
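
To make the yyyymmL form concrete, here is a small sketch that splits such a value into its year and month. YearMonth and split_yyyymm are hypothetical names chosen for this example; 199712L is the sample value given in the standard's own text.

```cpp
#include <cassert>

// Hypothetical holder for the two parts of a yyyymm value.
struct YearMonth {
    int year;
    int month;
};

// Splits a value of the form yyyymm (e.g. 199712L) into year and month.
inline YearMonth split_yyyymm(long value) {
    assert(value > 0);  // the macro, when defined, expands to a positive constant
    YearMonth result = { static_cast<int>(value / 100),
                         static_cast<int>(value % 100) };
    return result;
}
```

Calling split_yyyymm(199712L) gives year 1997 and month 12. On an implementation that defines the macro you could pass __STDC_ISO_10646__ itself to see which revision of ISO/IEC 10646 it claims to cover.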

See http://www.unicode.org/charts/ for a reference to Unicode short identifiers.

Hope this helps

Lefteris


Source: https://habr.com/ru/post/1436226/

