What is the difference between Unicode code points and Unicode scalar values?

I see the two terms used (apparently) interchangeably in many cases - are they the same or different? Does it also depend on whether the language speaks of UTF-8 (like Rust) or UTF-16 (like Java / Haskell)? Is the definition of the code point / scalar value somehow dependent on the encoding scheme?

1 answer

First, consider definitions D9, D10, and D10a in Section 3.4, Characters and Encoding:

D9 Unicode codespace: A range of integers from 0 to 10FFFF₁₆.

D10 Code point: Any value in the Unicode codespace.

β€’ .

...

D10a Code point type: Any of the seven fundamental classes of code points in the standard: Graphic, Format, Control, Private-Use, Surrogate, Noncharacter, Reserved.

[emphasis added]

Of those seven code point types, the one that matters here is Surrogate. Surrogate code points are the reason the standard needs the separate term "scalar value".

Now look at definition D76 in Section 3.9, Unicode Encoding Forms:

D76 Unicode scalar value: Any Unicode code point except high-surrogate and low-surrogate code points.

β€’ Unicode 0 D7FF 16 E000 16 10FFFF 16, .

Surrogates are explained in Section 3.8, which D76 references. The short version: surrogate code points exist only for the benefit of UTF-16. A 16-bit code unit can address only 2¹⁶ = 65536 values, but the codespace contains 1114112 code points, so UTF-16 represents every code point above FFFF₁₆ with a pair of surrogate code units (a high surrogate followed by a low surrogate). UTF-8 has no need for surrogates; it uses a variable number of bytes (1 to 4) per scalar value and encodes each one directly.
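
To illustrate that machinery (again a sketch of my own, not from the original answer), here is how a single code point above FFFF₁₆ looks in each encoding form in Rust: U+1F600 becomes one surrogate pair in UTF-16 and four bytes in UTF-8, with no surrogates involved in the latter.

```rust
fn main() {
    let emoji = '\u{1F600}'; // U+1F600, a scalar value above FFFF₁₆

    // UTF-16: one code point above FFFF -> a high/low surrogate pair of code units.
    let mut utf16 = [0u16; 2];
    emoji.encode_utf16(&mut utf16);
    assert_eq!(utf16, [0xD83D, 0xDE00]); // high surrogate, low surrogate

    // UTF-8: the same scalar value -> four bytes, no surrogates anywhere.
    let mut utf8 = [0u8; 4];
    emoji.encode_utf8(&mut utf8);
    assert_eq!(utf8, [0xF0, 0x9F, 0x98, 0x80]);

    println!("UTF-16 units: {:04X?}, UTF-8 bytes: {:02X?}", utf16, utf8);
}
```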

To answer the question: they are not the same thing, and neither definition depends on the encoding scheme. A scalar value is simply any code point that is not a surrogate; the encoding forms only determine the "code units" a string is made of. A UTF-16 string (as in Java) is a sequence of 16-bit code units and may even contain unpaired surrogates. A UTF-8 string (as in Rust) is a sequence of bytes that decodes only to scalar values.
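
One last sketch (mine, not the answerer's) of that difference in guarantees: Rust refuses to build a string from UTF-16 data containing an unpaired surrogate, which is exactly the kind of value a Java String can carry.

```rust
fn main() {
    // A well-formed surrogate pair decodes to the scalar value U+1F600.
    let ok = String::from_utf16(&[0xD83D, 0xDE00]);
    assert_eq!(ok.unwrap(), "😀");

    // An unpaired high surrogate is a valid code point but not a scalar value,
    // so it cannot appear in a Rust string (UTF-8), although a Java String
    // (a sequence of arbitrary UTF-16 code units) could hold it.
    let bad = String::from_utf16(&[0xD83D]);
    assert!(bad.is_err());
}
```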

See also the Unicode glossary, which gives short definitions of these and other Unicode terms.


Source: https://habr.com/ru/post/1692775/

