I have a problem with classic ASP / VBScript when trying to read a UTF-8 encoded XML file with MSXML. The file is encoded correctly; I have verified this with several other tools.
A stripped-down example of the XML:
<?xml version="1.0" encoding="UTF-8"?>
<itshop>
<Product Name="Backup gewünscht" />
</itshop>
If I try to do this in ASP ...
Const FOR_READING = 1 ' iomode value for OpenTextFile
' Read the whole file into a string
Set fso = Server.CreateObject("Scripting.FileSystemObject")
Set ts = fso.OpenTextFile("input.xml", FOR_READING)
XML = ts.ReadAll
ts.Close
Set ts = Nothing
Set fso = Nothing
' Parse the string and print the Name attribute of the first Product
Set myXML = Server.CreateObject("Msxml2.DOMDocument.4.0")
myXML.loadXML(XML)
Set DocElement = myXML.documentElement
Set ProductNodes = DocElement.selectNodes("//Product")
Response.Write ProductNodes(0).getAttribute("Name")
' ...
... and the name contains special characters (specifically German umlauts), the two bytes of the UTF-8 encoded umlaut are transcoded a second time, so I get two meaningless characters. What should be "ü" becomes "Ã¼": that is four bytes in my output, not two (correct UTF-8) or one (ISO-8859-#).
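To make the byte counts concrete, here is a small sketch outside ASP (Python, only because it is easy to run anywhere). It assumes, and this is only my guess, that the UTF-8 bytes are reinterpreted as Windows-1252 somewhere along the way and then encoded as UTF-8 a second time. The variable names are purely illustrative.

```python
# Assumption (not confirmed by the ASP code): the UTF-8 bytes of the file
# are misread as Windows-1252 and the result is written out as UTF-8 again.
original = "\u00fc"  # the character ü

utf8_bytes = original.encode("utf-8")         # correct UTF-8: 2 bytes (C3 BC)
latin1_bytes = original.encode("iso-8859-1")  # ISO-8859-1: 1 byte (FC)

# Each UTF-8 byte is treated as a separate Windows-1252 character ...
misread = utf8_bytes.decode("cp1252")         # two characters: U+00C3, U+00BC
# ... and those two characters are then encoded as UTF-8 once more.
double_encoded = misread.encode("utf-8")      # 4 bytes (C3 83 C2 BC)

print(len(utf8_bytes), len(latin1_bytes), len(misread), len(double_encoded))
print(utf8_bytes, latin1_bytes, double_encoded)
```

Under that assumption the four bytes are the first two bytes encoded once too often, which matches what I see in the output.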
What am I doing wrong? Why does MSXML think that the input is ISO-8859-#, so that it tries to convert it to UTF-8?