Escaping a Unicode string in an XmlElement, despite writing XML in UTF-8

For a particular XmlElement, I need to set the inner text to an escaped version of a Unicode string, even though the document is ultimately encoded in UTF-8. Is there any way to achieve this?

Here is a simple version of the code:

 const string text = "ñ";
 var document = new XmlDocument {PreserveWhitespace = true};
 var root = document.CreateElement("root");
 root.InnerXml = text;
 document.AppendChild(root);
 var settings = new XmlWriterSettings {Encoding = Encoding.UTF8, OmitXmlDeclaration = true};
 using (var stream = new FileStream("out.xml", FileMode.Create))
 using (var writer = XmlWriter.Create(stream, settings))
     document.WriteTo(writer);

Expected:

 <root>&#xF1;</root> 

Actual:

 <root>ñ</root> 

Using the XmlWriter directly and calling WriteRaw(text) works, but I only have access to the XmlDocument, and serialization happens later. On an XmlElement, InnerText escapes the & to &amp; as expected, and setting Value throws an exception.

Is there a way to set the inner text of an XmlElement to escaped ASCII text, regardless of the encoding that is eventually used? I feel like I must be missing something obvious, or it's just not possible.

1 answer

If you ask XmlWriter to produce ASCII output, it should give you character references for all non-ASCII content.

 var settings = new XmlWriterSettings {Encoding = Encoding.ASCII, OmitXmlDeclaration = true}; 

The output is still valid UTF-8, since ASCII is a subset of UTF-8.
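
As a rough end-to-end sketch of the suggestion above, the snippet below builds the same one-element document and serializes it through a writer configured with Encoding.ASCII. The class name AsciiXmlDemo and the helper SerializeAsAscii are made up for illustration, and a MemoryStream stands in for the FileStream from the question so the result can be inspected as a string. It assumes the writer targets a Stream, which is the case in the question's code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class AsciiXmlDemo
{
    // Hypothetical helper (not part of the original question or answer):
    // serializes the document with an ASCII-encoded XmlWriter, so characters
    // outside ASCII are emitted as numeric character references.
    public static string SerializeAsAscii(XmlDocument document)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,
            OmitXmlDeclaration = true
        };

        using (var stream = new MemoryStream())
        {
            // Disposing the writer flushes it; the stream stays usable afterwards.
            using (var writer = XmlWriter.Create(stream, settings))
            {
                document.WriteTo(writer);
            }
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }

    public static void Main()
    {
        var document = new XmlDocument { PreserveWhitespace = true };
        var root = document.CreateElement("root");
        root.InnerText = "ñ"; // stored in the DOM as the plain character
        document.AppendChild(root);

        Console.WriteLine(SerializeAsAscii(document));
    }
}
```

The DOM still holds the unescaped character; the escaping happens only at serialization time, so this approach depends on whoever creates the XmlWriter choosing the ASCII encoding.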


Source: https://habr.com/ru/post/1494851/

