UTF-8 vs ASCII text

Question

Why do SQL databases use UTF-8 encoding? Do UTF-8 and ASCII both use 8 bits to store a character?

+4

user15432 May 04 '10 at 14:42

3 answers

For "normal" characters (the ASCII range), only 8 bits are used. Characters that do not fit in 8 bits use additional bytes. This makes UTF-8 a variable-length encoding.
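To make the variable length concrete, here is a minimal Python sketch that counts the bytes UTF-8 needs for a few sample characters. The helper name `utf8_length` and the sample characters are my own choices for illustration, not something from the question.

```python
# Sample characters chosen for illustration: Latin letter, accented letter,
# currency symbol, emoji.
samples = ["A", "é", "€", "😀"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode the given character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    # Prints 1, 2, 3 and 4 bytes respectively for the samples above.
    print(repr(ch), utf8_length(ch), "byte(s)")
```

Only the first sample fits in a single byte; the others take two, three, and four bytes.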

Wikipedia has a good article on UTF-8.

ASCII defines only 128 characters, so it needs just 7 bits. In practice it is usually stored with 8 bits per character. RS-232 (old serial communication) can be configured to send 7 data bits per character.
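A quick way to check the 7-bit claim, sketched in Python under the assumption that the text is pure ASCII (the sample string is made up): every byte value stays below 128, and UTF-8 yields exactly the same bytes as ASCII for such text.

```python
text = "Hello, SQL!"  # sample string containing only ASCII characters

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# Every ASCII byte value is below 128, so the top (8th) bit is always 0.
all_seven_bit = all(b < 0x80 for b in ascii_bytes)

# For pure-ASCII text, UTF-8 produces byte-for-byte the same output.
same_bytes = ascii_bytes == utf8_bytes

print(all_seven_bit, same_bytes)
```

This is why existing ASCII data can be read as UTF-8 without conversion.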

+1

Gvs May 04 '10 at 14:45

ASCII can represent only a limited set of characters, so it is not very useful for any language that is not based on the Latin character set. UTF-8, however, is an encoding of Unicode (UCS-4) and can represent almost any language. It does this by combining several bytes to represent one character (or, more precisely, one code point).
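As a rough illustration in Python, a non-Latin word cannot be encoded as ASCII at all, while UTF-8 handles it by spending more than one byte per letter. The helper `try_encode` and the sample word are hypothetical names picked for this sketch.

```python
def try_encode(text: str, encoding: str):
    """Return the encoded bytes, or None if the encoding cannot represent text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


word = "Привет"  # Russian for "hello": 6 Cyrillic letters

ascii_result = try_encode(word, "ascii")   # ASCII has no Cyrillic letters
utf8_result = try_encode(word, "utf-8")    # each letter takes 2 bytes here

print(ascii_result)
print(len(word), "characters,", len(utf8_result), "bytes in UTF-8")
```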

0

Torlack May 04 '10 at 14:46

segfault · Accepted Answer · 2010-05-04T14:53:53+0000

UTF-8 is used to support a wide range of characters. In UTF-8, up to 4 bytes can be used to represent a single character.
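To show where the four-byte limit comes from, here is a hand-written encoder that follows the UTF-8 bit layout. It is a simplified sketch, not how any database or library actually implements it: the function name `encode_utf8` is my own, and unlike Python's built-in encoder it does not reject surrogate code points.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one Unicode code point by hand, following the UTF-8 bit layout.

    Simplification: surrogate code points (U+D800..U+DFFF) are not rejected.
    """
    if code_point < 0x80:        # 1 byte:  0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0x10FFFF:   # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return bytes([0xF0 | (code_point >> 18),
                      0x80 | ((code_point >> 12) & 0x3F),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    raise ValueError("code point is outside the Unicode range")


# Compare the hand-written result with Python's built-in encoder.
for ch in "Aé€😀":
    print(repr(ch), encode_utf8(ord(ch)).hex(), ch.encode("utf-8").hex())
```

The first byte announces how many bytes follow, and each continuation byte starts with the bits `10`, which is what lets a decoder find character boundaries.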

Joel wrote an article on this subject that you may want to refer to:

The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
