I am developing a class for a content management system. Input content is provided in XHTML format and may contain valid escaped characters, such as the escaped form of £. See the example below.
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head xmlns="">
<meta name="Attr_DocumentTitle" content="Hello World Books" />
</head>
<body>
<div>British Pound £</div>
<div>Registered sign ®</div>
<div>Copyright sign © </div>
</body>
</html>
My goal is to write a method that loads this into a .NET XmlDocument object, processes it, and saves it to a database. I want to keep the escaped characters as they are. Here is my method:
public static XmlDocument LoadXmlFromString(string xhtmlContent)
{
    byte[] xhtmlByte = Encoding.ASCII.GetBytes(xhtmlContent);
    MemoryStream mStream = new MemoryStream(xhtmlByte);
    XmlReaderSettings settings = new XmlReaderSettings();
    settings.XmlResolver = null;
    settings.ProhibitDtd = false;
    XmlReader reader = XmlReader.Create(mStream, settings);
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.LoadXml(xhtmlContent);
    return xmlDoc;
}
This method, however, converts the escaped characters to their character equivalents. How can I avoid this and keep the escaped characters?
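For reference, here is a minimal sketch of one direction that might work, assuming the escapes are numeric character references and that it is acceptable to re-escape on output rather than preserve them inside the DOM. As far as I can tell, the parser always expands character references while loading, so the idea is to write the document back out through an XmlWriter with an ASCII encoding, which makes the writer emit non-ASCII characters as numeric references again. The names XhtmlRoundTrip, Load and SaveAsAscii are hypothetical and only for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper class; not part of any library.
public static class XhtmlRoundTrip
{
    // Parse the XHTML string directly. Character references such as
    // &#163; are expanded to their characters at this point.
    public static XmlDocument Load(string xhtml)
    {
        var doc = new XmlDocument();
        doc.PreserveWhitespace = true;
        doc.LoadXml(xhtml);
        return doc;
    }

    // Serialize through an ASCII-encoded XmlWriter so that characters
    // outside ASCII are written as numeric character references.
    // The writer must target a Stream: with a TextWriter or
    // StringBuilder the Encoding setting is not applied.
    public static string SaveAsAscii(XmlDocument doc)
    {
        var settings = new XmlWriterSettings
        {
            Encoding = Encoding.ASCII,
            OmitXmlDeclaration = true
        };
        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                doc.Save(writer);
            }
            return Encoding.ASCII.GetString(stream.ToArray());
        }
    }
}
```

With this sketch, XhtmlRoundTrip.SaveAsAscii(XhtmlRoundTrip.Load(xhtmlContent)) should return markup in which £ appears as a hexadecimal reference (&#xA3;). The original spelling of each escape (decimal, hexadecimal or named) is not recorded by the parser, so it cannot be restored exactly this way.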