I'd hazard a guess that the last time this bit us was with Unicode: C++ originally specified wchar_t so that it could hold any Unicode character. That required at least 16 bits, because at the time Unicode promised it would never need more than 16 bits. Shortly after popular implementations settled on a 16-bit wchar_t, it turned out that 16 bits were not enough after all. Last I checked, Unicode code points need 21 bits (they go up to U+10FFFF), so why sell ourselves short again? 24-bit types are hardly widespread, and if you need to spell out a specific code point, it will most likely fit in 16 bits anyway, i.e. you can write it as \uNNNN (with \UNNNNNNNN available for code points beyond that range).
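A minimal sketch of the width issue (the output is platform-dependent; wchar_t is typically 16 bits on Windows/MSVC and 32 bits on most Unix-like systems):

    #include <cstdio>

    int main() {
        // wchar_t width is implementation-defined: usually 16 bits on
        // Windows (a UTF-16 code unit) and 32 bits on Linux/macOS.
        std::printf("wchar_t: %zu bits\n", sizeof(wchar_t) * 8);

        // A BMP code point fits in \uNNNN on any implementation...
        wchar_t e_acute = L'\u00E9';      // U+00E9, LATIN SMALL LETTER E WITH ACUTE

        // ...but a code point above U+FFFF needs \UNNNNNNNN, and where
        // wchar_t is 16 bits it cannot be stored in a single wchar_t.
        // wchar_t emoji = L'\U0001F600'; // lossy/ill-formed with 16-bit wchar_t
        char32_t emoji = U'\U0001F600';   // char32_t always holds any code point

        std::printf("U+%04X U+%06X\n",
                    static_cast<unsigned>(e_acute),
                    static_cast<unsigned>(emoji));
    }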
The wording of paragraph 2 of [lex.charset] (section 2.3 of the standard) implies that universal character names designate code points: a universal character name identifies a character by its short name in ISO/IEC 10646. I'm no Unicode expert, but I believe that short name is effectively the code point in U+NNNN notation.
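A quick way to see that a universal character name denotes a code point is to compare the char32_t value of an escaped literal against the raw number (a sketch, not an authoritative reading of the standard):

    #include <cassert>

    int main() {
        // \u00E9 names the character whose ISO/IEC 10646 short name is
        // U+00E9, so the resulting char32_t value *is* the code point.
        char32_t c = U'\u00E9';
        assert(c == 0x00E9);
    }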