You are talking about Unicode. Stored naively, each Unicode character (code point) takes 32 bits. Since that wastes memory, there are more compact encodings. UTF-8 is one such encoding. It uses bytes as units and maps each Unicode character to 1, 2, 3, or 4 bytes. UTF-16 is another one; it uses 16-bit words as units and maps each Unicode character to 1 or 2 words (2 or 4 bytes). You can use either encoding with either string or wchar_t. UTF-8 tends to be more compact for English text and numbers.
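To make the 1-to-4-byte mapping concrete, here is a minimal sketch of a UTF-8 encoder. The names encode_utf8 and utf16_units are hypothetical helpers written for illustration, not standard library functions, and the code assumes the input is a valid code point (at most U+10FFFF and not a surrogate).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Assumes cp is a valid scalar value (checked only by assert).
std::string encode_utf8(std::uint32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {                     // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {             // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {           // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                             // 4 bytes: 11110xxx followed by three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: number of 16-bit units UTF-16 needs for a code point.
std::size_t utf16_units(std::uint32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}
```

For example, encode_utf8(0x41) ('A') is one byte long, while encode_utf8(0x20AC) (the euro sign) is three bytes but only one UTF-16 unit.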
Some things will work regardless of the encoding and type used (compare, for example). However, every function that needs to understand a single character will be broken, i.e. the fifth character is not always the fifth entry in the underlying array. It may look like it works with certain examples, but it will eventually break. string::compare will work, but don't expect to get alphabetical ordering from it; that is language dependent. string::find_first_of will work for some inputs, but not for all. Long strings are likely to work just because they are long, while shorter ones can be confused by character alignment and produce bugs that are very hard to find.
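A minimal sketch of the indexing problem, assuming the std::string holds valid UTF-8. The helper names (is_continuation, count_code_points, offset_of_code_point) are made up for this illustration.

```cpp
#include <cstddef>
#include <string>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
bool is_continuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Count code points by counting every byte that is not a continuation byte.
// Assumes s is valid UTF-8; no validation is done here.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char b : s) {
        if (!is_continuation(b)) ++n;
    }
    return n;
}

// Byte offset of the n-th code point (0-based), or std::string::npos
// if the string has fewer than n + 1 code points.
std::size_t offset_of_code_point(const std::string& s, std::size_t n) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (is_continuation(static_cast<unsigned char>(s[i]))) continue;
        if (seen == n) return i;
        ++seen;
    }
    return std::string::npos;
}
```

With the UTF-8 bytes of "naïve" (std::string s = "na" "\xC3\xAF" "ve";), s.size() is 6 while count_code_points(s) is 5, and the fifth character 'e' sits at byte offset 5, whereas s[4] is 'v'. The same byte-level view explains find_first_of: searching the bytes of "à" ("\xC3\xA0") for the set "é" ("\xC3\xA9") reports a match at offset 0, because both share the lead byte 0xC3.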
It's best to find a library that handles this for you, and to ignore the underlying type (unless you have a good reason to choose one or the other).