Problem reading a website's encoding: three different encodings

I have a problem with WebRequest in C#. The page in question is a Google page.

The HTTP header states:

text/html; charset=ISO-8859-1

The page itself states:

<meta http-equiv=content-type content="text/html; charset=utf-8">

And finally, I get the expected result, both in the debugger and from the regular expression, when I use Encoding.Default, which resolves to System.Text.SBCSCodePageEncoding.

Now what should I do? Do you have any hints how this can happen or how I can solve this problem?
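One way to approach the disagreement between the header and the meta tag is to download the raw bytes first and only then decide which charset to decode them with. The following is a minimal sketch under that assumption, not a definitive implementation: the class `HtmlCharsetSketch` and its methods `FindMetaCharset` and `Decode` are hypothetical names, and the byte array in `Main` is a stand-in for the real response body (which could be obtained with, for example, `WebClient.DownloadData`). It prefers the charset declared in the markup, then the one from the header, and finally falls back to UTF-8.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class HtmlCharsetSketch
{
    // Hypothetical helper: returns the charset named in the first 2048 bytes, or "" if none.
    public static string FindMetaCharset(byte[] body)
    {
        // ASCII is enough to read the tag itself; non-ASCII bytes simply become '?'.
        string head = Encoding.ASCII.GetString(body, 0, Math.Min(body.Length, 2048));
        Match m = Regex.Match(head, @"charset\s*=\s*[""']?([A-Za-z0-9_\-]+)",
                              RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : "";
    }

    // Decodes the body, preferring the meta charset over the header charset.
    public static string Decode(byte[] body, string headerCharset)
    {
        string charset = FindMetaCharset(body);
        if (string.IsNullOrEmpty(charset)) charset = headerCharset;
        if (string.IsNullOrEmpty(charset)) charset = "utf-8";

        Encoding enc;
        try { enc = Encoding.GetEncoding(charset); }
        catch (ArgumentException) { enc = Encoding.UTF8; } // unknown name: assume UTF-8
        return enc.GetString(body);
    }

    public static void Main()
    {
        // Stand-in for the downloaded bytes: markup says utf-8, header says ISO-8859-1.
        byte[] body = Encoding.UTF8.GetBytes(
            "<meta http-equiv=content-type content=\"text/html; charset=utf-8\"><p>\u00FCber</p>");
        string html = Decode(body, "ISO-8859-1");
        Console.WriteLine(html.Contains("\u00FCber"));
    }
}
```

Preferring the markup over the header is a judgment call made for this particular page, where the header appears to be the wrong one; for other sites the opposite order may be more appropriate.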

The actual page encoding appears to be UTF-8. At least Firefox displays it correctly as UTF-8, and not as a Windows encoding or as Latin1.

URL this

The problem concerns that -sign, as well as all German umlauts.
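The effect on umlauts can be reproduced without any network access. The sketch below is only an illustration: the class name `EncodingMismatchDemo` and the sample word are assumptions, not part of the original program. It encodes a German word as UTF-8 and then decodes the same bytes once as UTF-8 and once as ISO-8859-1, the charset named in the header.

```csharp
using System;
using System.Text;

public static class EncodingMismatchDemo
{
    // Decodes the same bytes with the encoding of the given name.
    public static string Decode(byte[] body, string charset)
    {
        return Encoding.GetEncoding(charset).GetString(body);
    }

    public static void Main()
    {
        // "Grüße" as UTF-8 bytes, standing in for the real response body.
        string original = "Gr\u00FC\u00DFe";
        byte[] body = Encoding.UTF8.GetBytes(original);

        string asUtf8 = Decode(body, "utf-8");
        string asLatin1 = Decode(body, "ISO-8859-1");

        // The UTF-8 decode round-trips; the Latin1 decode turns each
        // two-byte umlaut into two separate characters.
        Console.WriteLine(asUtf8 == original);   // True
        Console.WriteLine(asLatin1 == original); // False
        Console.WriteLine(asLatin1.Length);      // 7 (one character per byte)
    }
}
```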

Thanks in advance for your help with this issue, which is driving me seriously crazy!

:

// create a writer and open the file (StreamWriter writes UTF-8 by default)
using (TextWriter tw = new StreamWriter("test.txt"))
{
    // write the downloaded HTML to the file
    tw.WriteLine(html);
} // leaving the using block flushes and closes the stream
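When dumping the HTML to a file for inspection, it may help to name the encoding explicitly on both the write and the read side, so that the file itself does not add a further source of confusion. The following is a sketch under that assumption: `SaveHtmlDemo`, `Save`, and the temporary file path are hypothetical names chosen for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class SaveHtmlDemo
{
    // Writes the text as UTF-8 without a byte order mark.
    public static void Save(string path, string html)
    {
        using (var tw = new StreamWriter(path, false, new UTF8Encoding(false)))
        {
            tw.Write(html);
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "test.txt");
        string html = "<p>Gr\u00FC\u00DFe</p>";

        Save(path, html);

        // Read the file back with the same encoding and compare.
        string roundTrip = File.ReadAllText(path, Encoding.UTF8);
        Console.WriteLine(roundTrip == html); // True
    }
}
```

If the text was already decoded with the wrong charset, the file will faithfully preserve the garbled characters; an explicit encoding here only rules out the file as the culprit.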


# RegEx UTF-8?


, HTML, Google Query API?

BTW, HTML ; -)

EDIT: :

  • API Google Desktop .
  • Google?
  • , , , , , - HTML , -. , - , . , , - , HTML. API, .

... , :

HTML


Source: https://habr.com/ru/post/1789210/

