For example, "Aݔ" is saved as "410754"
This is not how UTF-8 works.
Characters U+0000 through U+007F (aka ASCII) are stored as single bytes. They are the only characters whose code points numerically match their UTF-8 representation. For example, U+0041 becomes 0x41, which is 01000001 in binary.
All other characters are represented by multiple bytes. U+0080 through U+07FF use two bytes each, U+0800 through U+FFFF use three bytes each, and U+10000 through U+10FFFF use four bytes each.
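As a rough illustration of these ranges, here is a small Python sketch. The function name `utf8_length` is made up for this example, and the result is cross-checked against Python's built-in encoder. It does not treat surrogates (U+D800 through U+DFFF) specially.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for one code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0x7F:
        return 1  # ASCII
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4


for cp in (0x41, 0x754, 0x20AC, 0x1F600):
    # Cross-check against Python's own UTF-8 encoder.
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
    print(f"U+{cp:04X}: {utf8_length(cp)} byte(s)")
```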
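Computers know where one character ends and the next begins because UTF-8 was designed so that the single-byte values used for ASCII never overlap with the values used in multi-byte sequences. The bytes 0x00 through 0x7F are used only for ASCII; the bytes above 0x7F are used only in multi-byte sequences. In addition, the bytes that begin a multi-byte sequence cannot appear at any other position in that sequence.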
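To make the "no overlap" property concrete, here is a minimal Python sketch that finds character boundaries by looking at each byte on its own. The helper names `byte_kind` and `split_characters` are invented for this example, and it assumes the input is already well-formed UTF-8 (no validation is attempted). It relies on the fact that the non-leading bytes of a multi-byte sequence always fall in 0x80 through 0xBF.

```python
def byte_kind(b: int) -> str:
    """Classify a single byte by its high bits."""
    if b <= 0x7F:
        return "ascii"         # 0xxxxxxx
    if b <= 0xBF:
        return "continuation"  # 10xxxxxx
    return "lead"              # 11xxxxxx, starts a multi-byte sequence


def split_characters(data: bytes) -> list:
    """Split well-formed UTF-8 into one bytes object per character."""
    chars = []
    for b in data:
        if byte_kind(b) == "continuation" and chars:
            chars[-1] += bytes([b])   # belongs to the character already started
        else:
            chars.append(bytes([b]))  # ASCII or lead byte: a new character begins
    return chars


# "A" followed by U+0754
print([c.hex() for c in split_characters("A\u0754".encode("utf-8"))])
# prints ['41', 'dd94']
```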
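Because of this, code points outside the ASCII range have to be encoded. Consider the following binary patterns:
- 2 bytes: 110xxxxx 10xxxxxx
- 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
- 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx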
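The number of leading ones in the first byte tells you how many bytes the character occupies in total. Every following byte of the sequence starts with 10. To encode a character, convert its code point to binary and fill in the x's.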
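The "fill in the x's" step can be sketched with shifts and masks. This is an illustrative hand-written encoder, not how any particular library does it; the name `encode_utf8` is hypothetical, and surrogate code points are not rejected the way a strict encoder would reject them.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one code point by filling the x's of the UTF-8 bit patterns."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0x7F:
        return bytes([code_point])                     # 0xxxxxxx
    if code_point <= 0x7FF:
        return bytes([
            0b11000000 | (code_point >> 6),            # 110xxxxx
            0b10000000 | (code_point & 0b111111),      # 10xxxxxx
        ])
    if code_point <= 0xFFFF:
        return bytes([
            0b11100000 | (code_point >> 12),           # 1110xxxx
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    return bytes([
        0b11110000 | (code_point >> 18),               # 11110xxx
        0b10000000 | ((code_point >> 12) & 0b111111),
        0b10000000 | ((code_point >> 6) & 0b111111),
        0b10000000 | (code_point & 0b111111),
    ])


for cp in (0x41, 0x754, 0x20AC, 0x1F600):
    # The hand-written result should match Python's built-in encoder.
    assert encode_utf8(cp) == chr(cp).encode("utf-8")
```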
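For example: U+0754 lies between U+0080 and U+07FF, so it needs two bytes. 0x0754 is 11101010100 in binary, so you replace the x's with those digits:
110 11101 10 010100
That gives the two bytes 11011101 10010100, or 0xDD 0x94.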
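The same worked example can be reproduced step by step in Python, working on the bit string directly so each piece is visible. The variable names are just for illustration.

```python
cp = 0x0754

bits = format(cp, "011b")        # the 11 payload bits of a two-byte sequence
high, low = bits[:5], bits[5:]   # 5 bits for the first byte, 6 for the second

encoded = bytes([int("110" + high, 2), int("10" + low, 2)])

print(bits)           # prints 11101010100
print(high, low)      # prints 11101 010100
print(encoded.hex())  # prints dd94

# Python's built-in encoder agrees.
assert encoded == "\u0754".encode("utf-8")

# So "A" followed by U+0754 is stored as 41 DD 94, not 41 07 54.
print("A\u0754".encode("utf-8").hex())  # prints 41dd94
```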