Trying to read a wide char gives EOF

I have a text file foo.txt with this content:

R⁸2

I have a larger program that reads it and does something with each character, but it always gets EOF when it reaches the ⁸. Here are the relevant parts of the code:

setlocale(LC_ALL,"");

FILE *in = fopen(argv[1],"r");

while (1) {
    wint_t c = getwc(in);
    printf("%d ",wctob(c));

    if (c == -1)
        printf("Error %d: %s\n",errno,strerror(errno));

    if (c == WEOF)
        return 0;
}

It prints 82 -1 (the ASCII codes for R and EOF). No matter where I put the ⁸ in the file, it is always read as EOF. Edit: I added a check on errno, and it gives the following:

Error 84: Invalid or incomplete multibyte or wide character

However, ⁸ is Unicode U+2078 'SUPERSCRIPT EIGHT'. I wrote it to foo.txt with cat, copying the character from fileformat.info. A hex dump of foo.txt shows:

0000000: 52e2 81b8 32                             R...2

What is the problem?

1. WEOF is not EOF

EOF is an int constant, usually -1. WEOF is a separate constant of type wint_t. getwc returns WEOF, not EOF.

stdio.h:

#define EOF (-1)

wchar.h:

#define WEOF (0xffffffffu)
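
Because getwc reports both end of file and a failed read through the same WEOF return value, it may help to compare against WEOF and then ask the stream which of the two happened. The following is only a sketch; count_wide_chars is a hypothetical helper name, not something from the original program:

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: counts the wide characters in `in`.
   Returns the count, or -1 if getwc stopped on a read or
   conversion error rather than at end of file. */
long count_wide_chars(FILE *in)
{
    long n = 0;
    wint_t c;

    /* Compare with WEOF, not with EOF or a literal -1. */
    while ((c = getwc(in)) != WEOF)
        n++;

    /* WEOF is returned for end of file and for errors alike;
       ferror() tells the two cases apart. */
    if (ferror(in))
        return -1;
    return n;
}
```

A caller that gets -1 back can then inspect errno, for instance to see whether it is EILSEQ as in the question.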

2. Unicode

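The default locale of a C program is C (also known as POSIX), which only covers ASCII. You need to call setlocale with a locale that supports Unicode. C.UTF-8 is one such locale, selected with either of these calls: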

setlocale(LC_ALL,"C.UTF-8");
setlocale(LC_CTYPE,"C.UTF-8");
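
Whether a given locale name exists depends on the system, so it seems worth checking the result of setlocale rather than assuming it worked. Here is a rough sketch, assuming a POSIX system that provides nl_langinfo; select_utf8_ctype and the candidate list are made up for illustration:

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a UTF-8 LC_CTYPE locale is
   active after the call, 0 if none of the candidates gave one. */
int select_utf8_ctype(void)
{
    /* Tried in order; "" means "take the locale from the environment".
       The other names are assumptions and may not be installed. */
    const char *candidates[] = { "", "C.UTF-8", "en_US.UTF-8" };
    size_t i;

    for (i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        /* setlocale returns NULL when the requested locale is unavailable. */
        if (setlocale(LC_CTYPE, candidates[i]) == NULL)
            continue;
        /* Ask which character encoding the chosen locale actually uses. */
        if (strcmp(nl_langinfo(CODESET), "UTF-8") == 0)
            return 1;
    }
    return 0;
}
```

If this returns 0, getwc can be expected to reject multibyte UTF-8 input such as e2 81 b8, which would match the error shown in the question.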

3. Return type of getwc

getwc does not return char, int, or wchar_t; it returns wint_t. So c must be declared as wint_t, as it is in the code above.
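
To illustrate why the result is kept in a wint_t first, here is a small sketch that converts to wchar_t only after the WEOF check. read_wide_chars is a hypothetical name chosen for this example:

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: reads up to `cap` wide characters from `in`
   into `buf` and returns how many were stored. */
size_t read_wide_chars(FILE *in, wchar_t *buf, size_t cap)
{
    size_t n = 0;

    while (n < cap) {
        wint_t c = getwc(in);   /* wint_t can hold WEOF as well as any character */
        if (c == WEOF)
            break;              /* end of file or error: nothing to store */
        buf[n++] = (wchar_t)c;  /* convert only after the WEOF check */
    }
    return n;
}
```

Storing the result directly in a wchar_t would risk losing the distinction between WEOF and a real character on platforms where the two types differ.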

Source: https://habr.com/ru/post/1683550/

