What is the encoding of the string returned by StreamReader.ReadLine()?

First, let's see the code:

    // The encoding of utf8.txt is UTF-8
    StreamReader reader = new StreamReader(@"C:\utf8.txt", Encoding.UTF8, true);
    while (reader.Peek() >= 0)
    {
        // What is the encoding of lineFromTxtFile?
        string lineFromTxtFile = reader.ReadLine();
    }

As Joel said in his famous article:

"If you have a string in memory, in a file, or in an email message, you need to know what encoding it is in, or you cannot interpret or display it correctly."

So here is my question: what is the encoding of the lineFromTxtFile string? UTF-8 (because it comes from a text file encoded in UTF-8)? Or UTF-16 (since a string in .NET is "Unicode", i.e. UTF-16)?

Thanks.

+6
3 answers

.NET strings are Unicode. The encoding does not matter until you need to pass the string somewhere else. For example, if you write the string to a file, you specify the output encoding at that point. But since .NET handles everything you do with a string through library calls, it doesn't matter how it is represented in memory.
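To illustrate the point about choosing the encoding only on output, here is a minimal sketch. The class name WriteDemo and the helper Encode are made up for this example, and it writes to a MemoryStream instead of a real file so that it runs anywhere. The same in-memory string produces different bytes depending on the encoding handed to the StreamWriter.

```csharp
using System;
using System.IO;
using System.Text;

class WriteDemo
{
    // Hypothetical helper: writes text through a StreamWriter with the
    // given encoding and returns the bytes that ended up in the stream.
    public static byte[] Encode(string text, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            }
            // ToArray still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    static void Main()
    {
        string text = "caf\u00E9"; // "café"

        // Encodings created without a byte order mark, so only the text is written.
        Encoding utf8 = new UTF8Encoding(false);
        Encoding utf16 = new UnicodeEncoding(false, false);

        Console.WriteLine(Encode(text, utf8).Length);  // 5 bytes: é takes two
        Console.WriteLine(Encode(text, utf16).Length); // 8 bytes: two per char
    }
}
```

The string itself never changes; only the byte representation chosen at write time does.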

+2

All .NET string variables are encoded using Encoding.Unicode (UTF-16, little-endian). Better still, because you know that your text file is UTF-8 and passed the correct encoding to the StreamReader constructor, any special characters will be decoded correctly.
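A small sketch of what that means in practice, under the assumption that the input really is UTF-8. The names DecodeDemo and ReadFirstLine are invented for the example, and a MemoryStream stands in for the file. The reader decodes the UTF-8 bytes, and the resulting string is counted in UTF-16 code units.

```csharp
using System;
using System.IO;
using System.Text;

class DecodeDemo
{
    // Hypothetical helper: decodes UTF-8 bytes with a StreamReader,
    // configured the same way as in the question, and returns the first line.
    public static string ReadFirstLine(byte[] utf8Bytes)
    {
        using (var reader = new StreamReader(new MemoryStream(utf8Bytes), Encoding.UTF8, true))
        {
            return reader.ReadLine();
        }
    }

    static void Main()
    {
        // "café" plus a newline: 6 bytes in UTF-8, because é takes two bytes.
        byte[] utf8Bytes = Encoding.UTF8.GetBytes("caf\u00E9\n");
        string line = ReadFirstLine(utf8Bytes);

        Console.WriteLine(utf8Bytes.Length);                    // 6
        Console.WriteLine(line.Length);                         // 4 UTF-16 code units
        Console.WriteLine(Encoding.Unicode.GetByteCount(line)); // 8 bytes as UTF-16
    }
}
```

Once ReadLine has returned, the original UTF-8 bytes are gone from the picture; the string only holds the decoded characters.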

+5

It will be Unicode, because all .NET strings are. The real question is: why does this matter?

+1

Source: https://habr.com/ru/post/901269/

