I would like to get the number of characters in a file. By characters I mean "real" characters, not bytes, assuming I know the encoding of the file.
I tried to use mbstowcs(), but it does not work because it uses the system locale (or the one set with setlocale()). Since setlocale() is not thread-safe, I don't think it's a good idea to call it before mbstowcs(). Even if it were safe, I would have to be sure that my program could not "jump" (signal, etc.) between the two calls to setlocale() (one to switch to the file's encoding, and one to restore the previous locale).
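For reference, the attempt described above looks roughly like the sketch below. The function name count_chars_with_locale and its parameters are made up for illustration. It switches LC_CTYPE, lets mbstowcs() count, and switches back, which is exactly the window in which another thread or a signal handler would see the wrong locale.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the characters in the NUL-terminated string
 * `text`, interpreting it according to the locale `locale_name`.
 * Returns -1 on error. NOT thread-safe: it changes the process-wide locale. */
long count_chars_with_locale(const char *text, const char *locale_name)
{
    /* setlocale() may reuse its return buffer, so copy the current name. */
    const char *current = setlocale(LC_CTYPE, NULL);
    if (current == NULL)
        return -1;
    char *saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    /* Fails if the requested locale is not available on this system. */
    if (setlocale(LC_CTYPE, locale_name) == NULL) {
        free(saved);
        return -1;
    }

    /* With a NULL destination, mbstowcs() only counts the wide characters. */
    size_t n = mbstowcs(NULL, text, 0);

    /* Restore the previous locale. */
    setlocale(LC_CTYPE, saved);
    free(saved);

    return n == (size_t)-1 ? -1 : (long)n;
}
```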
To take an example, suppose we have a file ru.txt encoded with a Russian encoding (for example, KOI8). I would like to open the file and get the number of characters it contains, given that its encoding is KOI8.
It would be that simple if mbstowcs() accepted a source_encoding argument...
EDIT: Another problem with using mbstowcs() is that a locale corresponding to the encoding of the file must be installed on the system...
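To make the wish for a "source_encoding argument" concrete, here is a minimal sketch of what such a function could look like if it were built on POSIX iconv() instead of mbstowcs(). This swaps in a different technique and is not something I have working: count_chars is a made-up name, the encoding string is whatever name the local iconv implementation accepts (for example "KOI8-R"), and the whole file is assumed to be in memory already. It converts to UCS-4LE, where every code point takes exactly 4 bytes, and divides the output size by 4. It needs neither setlocale() nor an installed locale.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: counts the characters (code points) in `buf`, which
 * holds `len` bytes in the encoding named by `encoding`.
 * Returns -1 if the encoding is unknown or the input is invalid/truncated. */
long count_chars(const char *buf, size_t len, const char *encoding)
{
    /* Target is UCS-4LE: fixed width, 4 bytes per code point, no BOM. */
    iconv_t cd = iconv_open("UCS-4LE", encoding);
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)buf;   /* iconv() takes char **, input is not modified */
    size_t in_left = len;
    char out[4096];           /* scratch buffer; the converted text is discarded */
    long count = 0;

    while (in_left > 0) {
        char *out_ptr = out;
        size_t out_left = sizeof out;

        size_t rc = iconv(cd, &in, &in_left, &out_ptr, &out_left);

        /* Count whatever was converted in this round, even on E2BIG. */
        count += (long)((sizeof out - out_left) / 4);

        /* E2BIG only means the scratch buffer is full, so loop again.
         * EILSEQ (invalid sequence) and EINVAL (truncated sequence) are errors. */
        if (rc == (size_t)-1 && errno != E2BIG) {
            iconv_close(cd);
            return -1;
        }
    }

    iconv_close(cd);
    return count;
}
```

Usage would be something like count_chars(data, size, "KOI8-R") after reading ru.txt into data. Reading the file in chunks instead would require carrying over a multibyte sequence that is split across two chunks (the EINVAL case), which this sketch does not handle. Note that this counts code points, so a base letter followed by a combining mark counts as two.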