How to get the content of an XML element using XmlSerializer?
I have an XmlReader on this XML string:
<?xml version="1.0" encoding="UTF-8" ?>
<story id="1224488641nL21535800" date="20 Oct 2008" time="07:44">
<title>PRESS DIGEST - PORTUGAL - Oct 20</title>
<text>
<p> LISBON, Oct 20 (Reuters) - Following are some of the main
stories in Portuguese newspapers on Monday. Reuters has not
verified these stories and does not vouch for their accuracy. </p>
<p>More HTML stuff here</p>
</text>
</story>
I created an XSD and the corresponding class for deserialization.
[System.Xml.Serialization.XmlRootAttribute(Namespace="", IsNullable=false)]
public class story {
[System.Xml.Serialization.XmlAttributeAttribute()]
public string id;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string date;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string time;
public string title;
public string text;
}
Then I create an instance of the class using the Deserialize method of XmlSerializer.
XmlSerializer ser = new XmlSerializer(typeof(story));
return (story)ser.Deserialize(xr);
Now the text member of story is always null. How can I change the story class so that the XML is parsed as expected?
EDIT:
Using XmlText does not work, and I have no control over the XML I am parsing.
I found a very unsatisfactory solution.
Change the class as follows (ugh!)
// ...
[XmlElement("HACK - this should never match anything")]
public string text;
// ...
And change the calling code as follows (yuck!)
XmlSerializer ser = new XmlSerializer(typeof(story));
string text = string.Empty;
ser.UnknownElement += delegate(object sender, XmlElementEventArgs e) {
if (e.Element.Name != "text")
throw new XmlException(
string.Format(CultureInfo.InvariantCulture,
"Unknown element '{0}' cannot be deserialized.",
e.Element.Name));
text += e.Element.InnerXml;
};
story result = (story)ser.Deserialize(xr);
result.text = text;
return result;
This is a really bad way to do it because it breaks encapsulation. Is there a better way?
One option is to treat text as an array of p elements and mark it with XmlArray (and XmlArrayItemAttribute). That works when the XML looks like this:
<text>
<p>blah</p>
<p>blib</p>
</text>
Something like this:
public class Text //Obviously a bad name for a class...
{
public string[] p;
public string[] pre;
}
Edit:
Using XmlArrayItem, the story class becomes:
[System.Xml.Serialization.XmlRootAttribute(Namespace = "", IsNullable = false)]
public class story
{
[System.Xml.Serialization.XmlAttributeAttribute()]
public string id;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string date;
[System.Xml.Serialization.XmlAttributeAttribute()]
public string time;
public string title;
[XmlArrayItem("p")]
public string[] text;
}
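A minimal sketch of how the XmlArrayItem version of the class might be exercised. The ArrayDemo class, its Load helper, and the shortened sample XML are assumptions made for illustration; they are not part of the original post.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot(Namespace = "", IsNullable = false)]
public class story
{
    [XmlAttribute] public string id;
    [XmlAttribute] public string date;
    [XmlAttribute] public string time;
    public string title;

    // Each <p> child of <text> becomes one entry in the array.
    [XmlArrayItem("p")]
    public string[] text;
}

public static class ArrayDemo
{
    // Hypothetical helper: deserializes a story from an XML string.
    public static story Load(string xml)
    {
        var ser = new XmlSerializer(typeof(story));
        using (var reader = new StringReader(xml))
        {
            return (story)ser.Deserialize(reader);
        }
    }

    public static void Main()
    {
        // Shortened sample input (assumed), with plain-text paragraphs only.
        const string xml =
            "<story id=\"1224488641nL21535800\" date=\"20 Oct 2008\" time=\"07:44\">" +
            "<title>PRESS DIGEST - PORTUGAL - Oct 20</title>" +
            "<text><p>First paragraph</p><p>More HTML stuff here</p></text>" +
            "</story>";

        story result = Load(xml);
        foreach (string paragraph in result.text)
        {
            Console.WriteLine(paragraph);
        }
    }
}
```

This only covers paragraphs that contain plain text. If a p element holds nested markup (for example a b or a element), mapping it to a string will most likely fail or lose content, so this approach fits only when the structure of text is known in advance.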
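By contrast, a class like Text above, with one array per tag, corresponds to XML in which each array gets its own wrapper element, roughly like this: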
<text>
<p>
<p>qwertyuiop</p>
<p>asdfghjkl</p>
</p>
<pre>
<pre>stuff</pre>
<pre>nonsense</pre>
</pre>
</text>
I ran into the same problem after using XSD.exe to generate an XSD from the XML and then classes from the XSD. I added the [XmlText] attribute in front of the object in the generated class file (in my case the class was called P, after the <p> tag it mapped to as an XML node), and it worked instantly: it pulled in the full HTML content inside the parent node and put it into that P object, which I then renamed to something more useful.
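If the goal is to keep the raw markup inside text rather than split it into paragraphs, another option worth trying is XmlAnyElement. This is a different technique from the [XmlText] change described above, sketched here under assumptions: the StoryWithMarkup class, the textElement field, the GetTextMarkup method, and the AnyElementDemo helper are hypothetical names, and the sample XML is shortened.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot("story", Namespace = "", IsNullable = false)]
public class StoryWithMarkup
{
    [XmlAttribute] public string id;
    [XmlAttribute] public string date;
    [XmlAttribute] public string time;
    public string title;

    // Captures the whole <text> element as a DOM node instead of
    // mapping its children to typed members.
    [XmlAnyElement("text")]
    public XmlElement textElement;

    // Returns the markup inside <text>, or null if the element was absent.
    public string GetTextMarkup()
    {
        return textElement == null ? null : textElement.InnerXml;
    }
}

public static class AnyElementDemo
{
    // Hypothetical helper: deserializes a story from an XML string.
    public static StoryWithMarkup Load(string xml)
    {
        var ser = new XmlSerializer(typeof(StoryWithMarkup));
        using (var reader = new StringReader(xml))
        {
            return (StoryWithMarkup)ser.Deserialize(reader);
        }
    }

    public static void Main()
    {
        const string xml =
            "<story id=\"1224488641nL21535800\" date=\"20 Oct 2008\" time=\"07:44\">" +
            "<title>PRESS DIGEST - PORTUGAL - Oct 20</title>" +
            "<text><p>LISBON, Oct 20 (Reuters)</p><p>More <b>HTML</b> stuff here</p></text>" +
            "</story>";

        StoryWithMarkup result = Load(xml);
        Console.WriteLine(result.GetTextMarkup());
    }
}
```

Compared with the UnknownElement handler in the question, this keeps the markup handling inside the class, so the calling code stays a plain Deserialize call. Whitespace between the child elements may not be preserved exactly.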