C++ ifstream UTF-8 first characters

  • Why does a file saved as UTF-8 (in Notepad++) have these characters at the beginning of the fstream I opened on it in my C++ program?

    '╗┐

    I have no idea what it is; I just know that it is not there when I save the file as ASCII. UPDATE: if I save it as UTF-8 without BOM, it is not there.

  • How can I check a file's encoding in C++ (ASCII or UTF-8; everything else will be rejected ;))? Is it exactly these characters?

Thanks!

+3
5 answers

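When you save a file as UTF-16, each code unit takes two bytes. Different computers use different byte orders: some put the most significant byte first, others the least significant. Unicode reserves a special code point (U+FEFF), called the byte-order mark (BOM). A program that writes a file as UTF-16 puts this code point at the beginning of the file. A program that reads a UTF-16 file expects a BOM there and, by comparing the actual bytes with the expected ones, can tell whether the writer used the same byte order or whether every pair of bytes has to be swapped.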
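As a rough illustration of that byte-order check, here is a minimal sketch. The names `Utf16Order` and `detect_utf16_order` are made up for this example and do not come from any library; the only fact it relies on is that U+FEFF is written as FE FF in big-endian UTF-16 and as FF FE in little-endian UTF-16.

```cpp
#include <cstddef>

// Byte order signalled by a UTF-16 byte-order mark (U+FEFF).
// Hypothetical names, used only for this sketch.
enum class Utf16Order { BigEndian, LittleEndian, Unknown };

// Look at the first two bytes of a buffer and report which byte order
// the BOM indicates, or Unknown when no UTF-16 BOM is present.
inline Utf16Order detect_utf16_order(const unsigned char* data, std::size_t size) {
    if (size < 2) return Utf16Order::Unknown;
    if (data[0] == 0xFE && data[1] == 0xFF) return Utf16Order::BigEndian;
    if (data[0] == 0xFF && data[1] == 0xFE) return Utf16Order::LittleEndian;
    return Utf16Order::Unknown;  // no BOM: the caller has to be told the order
}
```

A reader would call this on the first bytes of the file and swap each byte pair when the reported order differs from its own. Note that FF FE followed by 00 00 is also how a little-endian UTF-32 BOM starts, so a real detector would probably check for that first.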

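When you save a file as UTF-8, there is no byte-order ambiguity. Even so, some programs, especially ones written for Windows, add a BOM anyway, encoded as UTF-8. Encoding the BOM code point as UTF-8 gives three bytes, 0xEF 0xBB 0xBF. Those bytes show up as garbage characters in most OEM code pages (the default for a console window on Windows).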
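To see where the three bytes come from, the sketch below encodes a single code point as UTF-8 by hand. `encode_utf8` is a hypothetical helper written for this illustration, not a standard library function. For U+FEFF the three-byte branch applies, and the bit shuffling yields EF BB BF.

```cpp
#include <cstdint>
#include <string>

// Encode one Unicode code point as UTF-8 (hypothetical helper).
// Returns an empty string for surrogates and values above U+10FFFF.
inline std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;  // surrogates are not encodable
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling `encode_utf8(0xFEFF)` gives the byte string `"\xEF\xBB\xBF"`, which is exactly the prefix the question shows rendered in a console code page.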

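The argument for doing this is that it marks the file as really being UTF-8, as opposed to some other native encoding. For example, many text files on Western Windows systems are in code page Windows 1252. Tagging the file with the UTF-8-encoded BOM makes it easier to tell the two apart.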

Note that any file that is pure ASCII is also valid UTF-8, and it will not start with a BOM.
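For the second part of the question (accept ASCII or UTF-8, reject everything else), one possible approach is to validate the bytes rather than rely on the BOM, since UTF-8 files may or may not have one. The sketch below is only an illustration; `Encoding` and `classify` are hypothetical names. It treats a buffer with no byte above 0x7F as ASCII, a buffer whose multi-byte sequences are all well formed as UTF-8 (a leading BOM counts, because EF BB BF is itself valid UTF-8), and anything else as rejected.

```cpp
#include <cstddef>
#include <string>

// Result of the check (hypothetical names for this sketch).
enum class Encoding { Ascii, Utf8, Other };

// Classify a byte buffer as pure ASCII, well-formed UTF-8, or neither.
inline Encoding classify(const std::string& bytes) {
    const std::size_t n = bytes.size();
    bool saw_non_ascii = false;
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (b < 0x80) { ++i; continue; }           // plain ASCII byte
        saw_non_ascii = true;
        std::size_t len = 0;
        unsigned long cp = 0, min = 0;
        if ((b & 0xE0) == 0xC0)      { len = 2; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; min = 0x10000; }
        else return Encoding::Other;               // stray continuation or invalid lead byte
        if (n - i < len) return Encoding::Other;   // sequence cut off at end of buffer
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) return Encoding::Other;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, surrogates, and values beyond U+10FFFF.
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return Encoding::Other;
        i += len;
    }
    return saw_non_ascii ? Encoding::Utf8 : Encoding::Ascii;
}
```

A Windows 1252 file containing, say, a lone 0xE9 byte for "é" fails this check, because 0xE9 announces a three-byte sequence that never follows. Passing the check does not prove the author meant UTF-8, only that the bytes are consistent with it.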

So if you want to process UTF-8 files, be prepared for the BOM. If it is there, just skip it.
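Skipping the BOM on an input stream could look roughly like this. Both function names are hypothetical, and the sketch assumes a seekable stream (a file or string stream) opened so that the raw bytes are visible, for example with `std::ios::binary`.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Consume a leading UTF-8 BOM (EF BB BF) if the stream starts with one.
// Returns true when a BOM was skipped; otherwise rewinds to where it began.
inline bool skip_utf8_bom(std::istream& in) {
    const std::istream::pos_type start = in.tellg();
    char head[3] = {0, 0, 0};
    in.read(head, 3);
    if (in.gcount() == 3 &&
        head[0] == '\xEF' && head[1] == '\xBB' && head[2] == '\xBF') {
        return true;
    }
    in.clear();        // a short read sets eofbit/failbit; reset before seeking
    in.seekg(start);   // no BOM: put the bytes back for the caller
    return false;
}

// Read the rest of the stream into a string, dropping any leading BOM.
inline std::string read_all_without_bom(std::istream& in) {
    skip_utf8_bom(in);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

With a file, this would be used as `std::ifstream in("input.txt", std::ios::binary); skip_utf8_bom(in);` before the usual reading loop, where `input.txt` is a placeholder name.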

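Addendum: U+FEFF was originally ZERO WIDTH NO-BREAK SPACE, a use that is now deprecated in favor of U+2060 WORD JOINER [Gillam, Richard, Unicode Demystified, Addison-Wesley, 2003, p. 108]. In UTF-8, the bytes 0xEF 0xBB 0xBF at the start of a file do not indicate byte order; they act as a signature saying the file is UTF-8. If you need a word joiner inside text, use U+2060 rather than U+FEFF. At the very start of a file, U+FEFF is treated as the BOM, not as part of the text.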

+7

, UTF8 , [...] , , , , ASCII.

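That is the byte-order mark (BOM), U+FEFF. My version (Notepad++ 5.4.3) writes the bytes EF BB BF at the start of a file saved as UTF-8, and it has a separate option for saving as UTF-8 without the BOM.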

. , . Unicde , , , .

+1

, ASCII UTF-8, ASCII. , UTF-8, UTF-8, .

+1

, , . , , , UTF-8 EF BB BF.

, , , . ( , ). , , Joel Spolsky , Unicode ( !)

0

, (.. ), , , , , (BOM) (sort of), UTF-8. / , .

, , . UTF-8. UTF-8 , UTF-8 , , , .

0

Source: https://habr.com/ru/post/1756440/

