C# File.ReadAllText does weird things

What I'm trying to do is read all the text in a file and, if it contains the word "Share", run a regular expression against it. Here is the code:

    DirectoryInfo dinfo = new DirectoryInfo(@"C:\Documents and Settings\g\Desktop\123");
    FileInfo[] Files = dinfo.GetFiles("*.txt");
    foreach (FileInfo filex in Files)
    {
        string contents = File.ReadAllText(filex.FullName);
        string matchingcontants = "Share";
        if (contents.Contains(matchingcontants))
        {
            string sharename = Regex.Match(contents, @"\+(\S*)(.)(.*)(.)").Groups[3].Value;
            File.AppendAllText(@"C:\sharename.txt", sharename + @"\r\n");
        }
    }

When I debug, I get:

    contents = "\r\0\n\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0=\0\r\0\n\0+\0S\0h\0a\0r\0e\0 \0\\\0\\\0j\05\02\0\\\0w\0w\0w\0_\0O\0n\0t\

and the substring

    \0S\0h\0a\0r\0e\

does not match "Share". Any hints, tips, or suggestions?

+3
3 answers

It looks like your file is saved as UTF-16 (Encoding.Unicode in .NET, which is little-endian UTF-16). Read the file with the correct encoding, and everything should be fine.

Fortunately, there is an overload of File.ReadAllText that accepts the encoding:

 string contents = File.ReadAllText(filex.FullName, Encoding.Unicode); 

Unfortunately, this will do the wrong thing for files that are not in UTF-16. Although there are heuristic ways to guess the encoding, ideally you should know the encoding before opening the file.
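
To make the "heuristic" part concrete, here is a minimal sketch of one possible guess, not a robust detector. The names EncodingGuess, Guess and Decode are made up for this example, and the "more than half of the odd-position bytes are zero" threshold is an arbitrary assumption that only works for mostly-ASCII text. It checks for a UTF-16 byte order mark first and falls back to counting zero bytes when there is none.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingGuess
{
    // Hypothetical helper: guesses between UTF-16 and UTF-8 only.
    // Limitation: a UTF-32 LE BOM (FF FE 00 00) is misread as UTF-16 LE here.
    public static Encoding Guess(byte[] bytes)
    {
        // A byte order mark, when present, is the most reliable signal.
        if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
            return Encoding.Unicode;          // UTF-16 little-endian
        if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
            return Encoding.BigEndianUnicode; // UTF-16 big-endian

        // No BOM: mostly-ASCII UTF-16 LE text has 0x00 at almost every odd index.
        int pairs = bytes.Length / 2;
        int zeroHighBytes = 0;
        for (int i = 1; i < bytes.Length; i += 2)
        {
            if (bytes[i] == 0) zeroHighBytes++;
        }

        // Assumed threshold: more than half of the odd-index bytes are zero.
        if (pairs > 0 && zeroHighBytes * 2 > pairs)
            return Encoding.Unicode;

        return Encoding.UTF8;
    }

    // Decodes the bytes with the guessed encoding; StreamReader strips a BOM if there is one.
    public static string Decode(byte[] bytes)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), Guess(bytes), true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the loop from the question this would be used as `string contents = EncodingGuess.Decode(File.ReadAllBytes(filex.FullName));`. If you control how the files are produced, knowing the encoding up front is still the better option.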

+7

It looks like this is a Unicode file, and you are trying to read it like regular ASCII.
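
A small sketch that reproduces the symptom without any file, in case it helps to see it. EncodingDemo and DecodeBothWays are names invented for this example. It encodes "+Share" as UTF-16 and then decodes the same bytes once as UTF-8 (what File.ReadAllText falls back to when there is no byte order mark) and once as UTF-16.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Decodes the UTF-16 LE bytes of `text` with the wrong (UTF-8) and the right decoder.
    public static (string Wrong, string Right) DecodeBothWays(string text)
    {
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(text); // GetBytes emits no BOM
        string wrong = Encoding.UTF8.GetString(utf16Bytes);  // every second char becomes '\0'
        string right = Encoding.Unicode.GetString(utf16Bytes);
        return (wrong, right);
    }

    static void Main()
    {
        var (wrong, right) = DecodeBothWays("+Share");
        Console.WriteLine(wrong.Contains("Share")); // False: the text is "+\0S\0h\0a\0r\0e\0"
        Console.WriteLine(right.Contains("Share")); // True
        Console.WriteLine(wrong.Length);            // 12, twice the original 6 characters
    }
}
```

string.Contains does an ordinal comparison, so the interleaved '\0' characters are enough to make the search for "Share" fail.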

+2

I assume the encoding is not set correctly; you may need to use the File.ReadAllText(String, Encoding) overload and pass the file's actual encoding.
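
Applied to the loop from the question, that could look roughly like the sketch below. ShareExtractor and ExtractShares are hypothetical names, the input and output paths are passed in instead of being hard-coded, and it assumes every input file really is UTF-16 little-endian. The regular expression and the capture group are unchanged from the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

static class ShareExtractor
{
    // Hypothetical wrapper around the loop from the question.
    public static void ExtractShares(string inputDir, string outputFile)
    {
        foreach (string path in Directory.GetFiles(inputDir, "*.txt"))
        {
            // Assumption: all input files are UTF-16 little-endian.
            string contents = File.ReadAllText(path, Encoding.Unicode);
            if (!contents.Contains("Share"))
            {
                continue;
            }

            // Same pattern and group index as in the question.
            string shareName = Regex.Match(contents, @"\+(\S*)(.)(.*)(.)").Groups[3].Value;

            // Environment.NewLine writes a real line break.
            File.AppendAllText(outputFile, shareName + Environment.NewLine);
        }
    }
}
```

One unrelated detail: the question appends @"\r\n", which is a verbatim string, so it writes the four literal characters \r\n rather than a line break. Environment.NewLine (or a plain "\r\n") avoids that.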

+2

Source: https://habr.com/ru/post/916664/