C#: whitespace problem with XmlReader

I have a simple XML document:

    <data>
      <node1>value1</node1>
      <node2>value2</node2>
    </data>

I am using IXmlSerializable to read and write this XML with a DTO. The following code works just fine:

    XmlReader reader;
    ...
    while (reader.Read())
    {
        Console.Write(reader.ReadElementContentAsString());
    }
    // outputs value1value2

However, if the whitespace between the nodes is removed from the XML, i.e.

    <data>
      <node1>value1</node1><node2>value2</node2>
    </data>

or if I set XmlReaderSettings.IgnoreWhitespace = true, the code outputs only "value1" and ignores the second node. When I print the nodes the parser visits, I see that ReadElementContentAsString moves the pointer to the EndElement of node2, but I don't understand why this happens or how to fix it.

Could this be a bug in the XML parser implementation?

=================================================

Here is sample code and two XML samples that give different results:

    string homedir = Path.GetDirectoryName(Application.ExecutablePath);
    string xml = Path.Combine(homedir, "settings.xml");
    FileStream stream = new FileStream(xml, FileMode.Open);
    XmlReaderSettings readerSettings = new XmlReaderSettings();
    readerSettings.IgnoreWhitespace = false;
    XmlReader reader = XmlTextReader.Create(stream, readerSettings);
    while (reader.Read())
    {
        if (reader.MoveToContent() == XmlNodeType.Element && reader.Name != "data")
        {
            System.Diagnostics.Trace.WriteLine(
                reader.NodeType + " " + reader.Name + " " + reader.ReadElementContentAsString());
        }
    }
    stream.Close();

1.) settings.xml

    <?xml version="1.0"?>
    <data>
      <node-1>value1</node-1>
      <node-2>value2</node-2>
    </data>

2.) settings.xml

    <?xml version="1.0"?>
    <data>
      <node-1>value1</node-1><node-2>value2</node-2>
    </data>

Using (1) prints

    Element node-1 value1
    Element node-2 value2

Using (2) prints

    Element node-1 value1
4 answers

What happens is that reader.Read() consumes the whitespace node that follows the end tag of node1. When whitespace is ignored (or absent), the reader is already positioned on node2 after ReadElementContentAsString returns, so the same Read() call steps past the start tag of node2 and that element is missed.

Inspect the reader's properties in the debugger before and after each method called in your example, in particular NodeType and Value. Also take a look at the MoveToContent method; it is very useful.
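As a minimal sketch of that kind of inspection, the program below lists every node that Read() visits for the two layouts from the question. The class name NodeDump and the method Describe are made up for this illustration, and the XML is passed as a string rather than read from settings.xml.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class NodeDump
{
    // Returns one "NodeType:Name" entry for each node visited by Read().
    // Name is empty for Text and Whitespace nodes.
    public static List<string> Describe(string xml)
    {
        var nodes = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                nodes.Add(reader.NodeType + ":" + reader.Name);
            }
        }
        return nodes;
    }

    static void Main()
    {
        // The two layouts from the question, inlined as strings.
        string spaced = "<data> <node1>value1</node1> <node2>value2</node2> </data>";
        string packed = "<data> <node1>value1</node1><node2>value2</node2> </data>";

        Console.WriteLine(string.Join(", ", Describe(spaced)));
        Console.WriteLine(string.Join(", ", Describe(packed)));
    }
}
```

In the second listing, EndElement:node1 is followed directly by Element:node2, with no Whitespace node in between for an extra Read() to consume.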

Read the documentation for all of these methods and properties and you will find out how the XmlReader class works and how to use it for your own purposes. The first Google result contains a very explicit example.

I ended up with the following (incomplete) template:

    private static void ReadXmlExt(XmlReader xmlReader, IXmlSerializableExt xmlSerializable, ReadElementDelegate readElementCallback)
    {
        bool isEmpty;

        if (xmlReader == null)
            throw new ArgumentNullException("xmlReader");
        if (readElementCallback == null)
            throw new ArgumentNullException("readElementCallback");

        // Empty element?
        isEmpty = xmlReader.IsEmptyElement;
        // Decode attributes
        if ((xmlReader.HasAttributes == true) && (xmlSerializable != null))
            xmlSerializable.ReadAttributes(xmlReader);
        // Read the root start element
        xmlReader.ReadStartElement();

        // Decode elements
        if (isEmpty == false)
        {
            do
            {
                // Read document till next element
                xmlReader.MoveToContent();

                if (xmlReader.NodeType == XmlNodeType.Element)
                {
                    string elementName = xmlReader.LocalName;

                    // Empty element?
                    isEmpty = xmlReader.IsEmptyElement;
                    // Decode child element
                    readElementCallback(xmlReader);
                    xmlReader.MoveToContent();

                    // Read the child end element (not empty)
                    if (isEmpty == false)
                    {
                        // Delegate check: it has to reach an end element
                        if (xmlReader.NodeType != XmlNodeType.EndElement)
                            throw new InvalidOperationException(String.Format("not reached the end element"));
                        // Delegate check: the end element shall correspond to the start element before delegate
                        if (xmlReader.LocalName != elementName)
                            throw new InvalidOperationException(String.Format("not reached the relative end element of {0}", elementName));

                        // Child end element
                        xmlReader.ReadEndElement();
                    }
                }
                else if (xmlReader.NodeType == XmlNodeType.Text)
                {
                    if (xmlSerializable != null)
                    {
                        // Interface
                        xmlSerializable.ReadText(xmlReader);
                        Debug.Assert(xmlReader.NodeType != XmlNodeType.Text, "IXmlSerializableExt.ReadText shall read the text");
                    }
                    else
                        xmlReader.Skip(); // Skip text
                }
            } while (xmlReader.NodeType != XmlNodeType.EndElement);
        }
    }

According to the IgnoreWhitespace documentation, a newline is not considered insignificant whitespace:

White space that is not considered to be significant includes spaces, tabs, and blank lines used to set apart the markup for greater readability. An example of this is white space in element content.

XmlReaderSettings.IgnoreWhitespace


This is not as thorough as Luca's answer, but I have found the following pattern useful with reasonably "predictable" XML (containing only whitespace and values). Consider:

    string homedir = Path.GetDirectoryName(Application.ExecutablePath);
    string xml = Path.Combine(homedir, "settings.xml");
    FileStream stream = new FileStream(xml, FileMode.Open);
    XmlReaderSettings readerSettings = new XmlReaderSettings();
    readerSettings.IgnoreWhitespace = false;
    XmlReader reader = XmlReader.Create(stream, readerSettings);
    while (reader.Read())
    {
        if (reader.MoveToContent() == XmlNodeType.Element && reader.Name != "data")
        {
            XmlNodeType type = reader.NodeType; // capture before the reader moves on
            string name = reader.Name;
            string value = null;
            if (!reader.IsEmptyElement)
            {
                reader.Read();                        // advances reader to the element content
                value = reader.ReadContentAsString(); // advances reader to the EndElement
            }
            // No extra reader.Read() here: the loop's own Read() steps past the
            // end tag, so an adjacent sibling element is not skipped.
            System.Diagnostics.Trace.WriteLine(type + " " + name + " " + value);
        }
    }
    stream.Close();

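In general, instead of reader.ReadElementContent*(), use reader.Read() followed by reader.ReadContent*(). ReadElementContent*() leaves the reader positioned after the end tag, whereas ReadContent*() stops on the end tag itself.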
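A different way to avoid the double advance, sketched here as an assumption rather than taken from any of the answers, is to keep ReadElementContentAsString() but drop the unconditional Read() and let MoveToContent() drive the loop. The class name ElementValues and the method ReadAll are hypothetical, and the root element is assumed to be named "data" as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class ElementValues
{
    // Collects "name=value" for each child element of <data>.
    public static List<string> ReadAll(string xml, bool ignoreWhitespace)
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = ignoreWhitespace };
        var values = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();          // skip the declaration, land on <data>
            reader.ReadStartElement("data"); // step inside the root element

            // MoveToContent() skips whitespace but never steps past an element,
            // and ReadElementContentAsString() already moves past the end tag,
            // so no additional Read() is needed between siblings.
            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                string name = reader.Name;
                string value = reader.ReadElementContentAsString();
                values.Add(name + "=" + value);
            }
        }
        return values;
    }

    static void Main()
    {
        string packed = "<?xml version=\"1.0\"?><data> <node-1>value1</node-1><node-2>value2</node-2> </data>";
        foreach (string entry in ReadAll(packed, false))
        {
            Console.WriteLine(entry);
        }
    }
}
```

Because the loop never advances the reader on its own, the result should not depend on whether whitespace separates the elements or on the IgnoreWhitespace setting.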


If you do not want the XmlReader to report whitespace, initialize it with settings like this:

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.IgnoreWhitespace = true;
    XmlReader xrd = XmlReader.Create(@"file.xml", settings);

This works for me with an XML file of the structure you posted:

    <data>
      <node1>value1</node1>
      <node2>value2</node2>
    </data>

Source: https://habr.com/ru/post/1397861/

