Is a UTF-8 null the same as a UTF-16 / UTF-32 null?

Is a single zero byte a null in UTF-16 and UTF-32, as it is in UTF-8, or do we need 2 and 4 zero bytes to encode a null in UTF-16 and UTF-32 respectively?

1 answer

In UTF-16 a null takes two bytes, and in UTF-32 it takes four bytes.

Otherwise you could not distinguish between a character whose encoded value merely begins with a zero byte and a single zero byte representing U+0000.
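This can be checked with Python's built-in codecs. The sketch below assumes the big-endian variants (`utf-16-be`, `utf-32-be`) so that no byte order mark is prepended; the variable names are only illustrative.

```python
NUL = "\u0000"

# Encode U+0000 in each encoding form.
nul_utf8 = NUL.encode("utf-8")
nul_utf16 = NUL.encode("utf-16-be")
nul_utf32 = NUL.encode("utf-32-be")

print(len(nul_utf8), len(nul_utf16), len(nul_utf32))  # 1 2 4

# The letter 'A' (U+0041) in big-endian UTF-16 starts with a zero byte,
# so a lone zero byte cannot be treated as a terminator there.
letter_a = "A".encode("utf-16-be")
print(letter_a)  # b'\x00A'
```

With a little-endian encoding the zero byte of 'A' would come second instead of first, but the same ambiguity applies.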

Basically, UTF-16 works in blocks of 2 bytes, and UTF-32 works in blocks of 4 bytes. (Of course, for characters outside the BMP you need two UTF-16 "blocks", but the principle is the same.) If you were implementing a UTF-16 decoder, you would read two bytes at a time.
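As a rough illustration of reading two bytes at a time, here is a minimal sketch that splits a byte string into 16-bit blocks. The function name `utf16_code_units` and the big-endian default are assumptions made for this example; it is not a full decoder and does not combine surrogate pairs.

```python
def utf16_code_units(data: bytes, byteorder: str = "big") -> list:
    """Split raw UTF-16 bytes into 16-bit blocks, two bytes at a time."""
    if len(data) % 2 != 0:
        raise ValueError("UTF-16 data must contain an even number of bytes")
    return [
        int.from_bytes(data[i:i + 2], byteorder)
        for i in range(0, len(data), 2)
    ]

# 'A', then U+0000, then U+1F600 (outside the BMP, so it needs two blocks).
encoded = "A\u0000\U0001F600".encode("utf-16-be")
units = utf16_code_units(encoded)
print([hex(u) for u in units])  # ['0x41', '0x0', '0xd83d', '0xde00']
```

Only the block whose value is 0 is the null character; the zero byte inside the block for 'A' is just half of 0x0041.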


Source: https://habr.com/ru/post/1307584/

