Could you try running this code (in place of your using block) and paste the result again? I'm assuming you're on .NET 4.
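    // Dump the raw response bytes as hex, bypassing any text decoding.
    // Needs the System and System.IO namespaces.
    using (var responseStream = response.GetResponseStream())
    using (var memoryStream = new MemoryStream())
    {
        responseStream.CopyTo(memoryStream);
        byte[] bytes = memoryStream.ToArray();
        content = BitConverter.ToString(bytes);   // e.g. "20-96-20"
    }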
Edit: I noticed that you did not paste the entire returned string into your posts. Is that because the rest of the line contains sensitive data? If so, do not paste the output of the code suggested above.
Edit 2: To get the correct result, you can use Encoding.GetEncoding(1252); however, I would suggest that you don't, for reasons I explain below.
Explanation: From what I understand, the problem is that the sending side does not declare its encoding correctly. You say their documentation claims UTF-8, which clearly contradicts the ISO-8859-1 in their XML declaration. In fact, the encoding actually used is neither of the two.
In the hexadecimal string that you posted, the culprit character has the byte value 0x96 and occurs in the middle of the sequence 20-96-20. In both UTF-8 and ISO-8859-1 (as well as in ASCII before them), 0x20 is the space character. However, in UTF-8, 0x96 is a continuation byte and is invalid unless it follows a lead byte (which 0x20 is not). In ISO-8859-1, 0x96 is a C1 control character and therefore not printable (it cannot be displayed to users).
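The byte-level reasoning above can be checked directly. The following is only a minimal sketch and not part of the original question: the class EncodingProbe and its method names are made up for illustration. It relies on a strict UTF-8 decoder (one that throws instead of silently substituting U+FFFD) and on the built-in ISO-8859-1 encoding.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
static class EncodingProbe
{
    // Returns true if the bytes form well-formed UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // Second argument (throwOnInvalidBytes: true) makes the decoder
        // throw on malformed input rather than emit U+FFFD.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    // Returns the code point that ISO-8859-1 assigns to a single byte.
    public static int Latin1CodePoint(byte b)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(new[] { b })[0];
    }
}
```

With the sequence from the dump, EncodingProbe.IsValidUtf8(new byte[] { 0x20, 0x96, 0x20 }) should return false, and EncodingProbe.Latin1CodePoint(0x96) should return 0x96, that is, the C1 control character U+0096.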
Thus, we can conclude that the source encoding is neither UTF-8 nor ISO-8859-1, but Windows-1252, which is sometimes regarded as a "superset" of ISO-8859-1 because it replaces the control characters in the range 0x80 - 0x9F with printable characters. Indeed, in Windows-1252, 0x96 is the en dash you were expecting.
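To confirm the Windows-1252 reading, one can decode the single byte and look at the resulting code point. This is a small sketch under the answer's assumption of .NET Framework 4, where code page 1252 is available out of the box; the class name Cp1252Probe is hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical helper, named for illustration only.
static class Cp1252Probe
{
    // Returns the code point that Windows-1252 assigns to a single byte.
    public static int Decode(byte b)
    {
        // Assumption: running on .NET Framework, where code page 1252 is built in.
        // On .NET Core / .NET 5+ the code pages provider must be registered first
        // via Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
        Encoding cp1252 = Encoding.GetEncoding(1252);
        return cp1252.GetString(new[] { b })[0];
    }
}
```

Cp1252Probe.Decode(0x96) should return 0x2013, the code point of the en dash, whereas ISO-8859-1 maps the same byte to the control character U+0096.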
Given the above, it should be safe to solve your problem by assuming the encoding is Windows-1252; however, if I were you, I would contact the provider and inform them of this defect.
    // Decode the response body as Windows-1252 (code page 1252).
    using (var stream = response.GetResponseStream())
    using (var reader = new StreamReader(stream, System.Text.Encoding.GetEncoding(1252)))
        content = reader.ReadToEnd();