Deserializing a single element in a large XML document: xmlSerializer.Deserialize(xmlReader.ReadSubtree()) fails due to namespace problems

I am trying to process a large XML document in one pass (using XmlReader) and deserialize only certain elements in it using XmlSerializer.

Below is some code and a tiny sample XML document showing how I tried to do this.

Justification for using XmlReader:

1. I am dealing with very large XML documents (between 10 and 250 MB), which I therefore do not want to load into memory. So XmlDocument is out of the question.
2. I want to extract only certain elements. I can usually ignore most of the other content. XmlReader seems to give me an efficient way to skip irrelevant content.
3. I do not know in advance whether a document contains any elements that I can handle; therefore I do not want to run a series of XPath / XQuery or LINQ to XML queries, because I want to make only one pass through the XML files (due to their size).

    public class ElementOfInterest { }
    …
    var xml = @"<?xml version='1.0' encoding='utf-8' ?>
    <Root xmlns:ex='urn:stakx:example'
          xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
        <ElementOfInterest xsi:type='ex:ElementOfInterest' />
    </Root>";

    var reader = System.Xml.XmlReader.Create(new System.IO.StringReader(xml));
    reader.ReadToFollowing("ElementOfInterest");

    var serializer = new System.Xml.Serialization.XmlSerializer(typeof(ElementOfInterest));
    serializer.Deserialize(reader.ReadSubtree());

The last line of code throws the following inner exception:

InvalidOperationException: "The ex namespace prefix is not defined."

Obviously, the XmlSerializer does not recognize the ex namespace prefix inside the value of the xsi:type attribute.

This is just one error I ran into, but frankly, the bigger problem is that I have no idea how to solve the namespace problem as a whole. I'm just looking for a convenient way to deserialize a single node from an XML document, but this seems to require manually registering / managing namespaces and somehow passing them on from XmlReader to XmlSerializer.

Can someone demonstrate how to deserialize a single node from an XML document read using XmlReader, either by pointing out the error in my code or by suggesting an alternative approach?

1 answer

The following works:

    using System.IO;
    using System.Xml;
    using System.Xml.Serialization;

    [XmlRoot(Namespace = "urn:stakx:example")]
    public class ElementOfInterest { }

    class Program
    {
        static void Main()
        {
            var xml = @"<?xml version='1.0' encoding='utf-8' ?>
    <Root xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
          xmlns:ex='urn:stakx:example'>
        <ex:ElementOfInterest xsi:type='ex:ElementOfInterest' />
    </Root>";

            // Register the "ex" prefix up front and hand it to the reader
            // through a parser context.
            var nt = new NameTable();
            var mgr = new XmlNamespaceManager(nt);
            mgr.AddNamespace("ex", "urn:stakx:example");
            var ctxt = new XmlParserContext(nt, mgr, "", XmlSpace.Default);
            var reader = XmlReader.Create(new StringReader(xml), null, ctxt);

            var serializer = new XmlSerializer(typeof(ElementOfInterest));

            // Match the element by local name and namespace URI.
            reader.ReadToFollowing("ElementOfInterest", "urn:stakx:example");
            var eoi = (ElementOfInterest)serializer.Deserialize(reader.ReadSubtree());
        }
    }

Note the namespace on the element in the input: <ex:ElementOfInterest>. This matches the namespace given in the [XmlRoot] attribute on the class.
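
Since the question asks for a single pass over a large document, the same ReadToFollowing / ReadSubtree pattern can be put in a loop. The following is only a minimal sketch under some assumptions: the Id attribute, the OnePassExample class, the ReadAll method and the sample document are hypothetical and exist only for illustration, and the sample elements carry no xsi:type attribute, so the reader is created without the XmlParserContext used above. Each subtree reader is disposed before the loop continues, because the outer reader is only repositioned at the end of the subtree once the subtree reader has been closed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

[XmlRoot(Namespace = "urn:stakx:example")]
public class ElementOfInterest
{
    // Hypothetical attribute, added only so each deserialized element can be told apart.
    [XmlAttribute]
    public string Id { get; set; }
}

public static class OnePassExample
{
    // Hypothetical sample input: two elements of interest among unrelated content.
    public const string SampleXml = @"<?xml version='1.0' encoding='utf-8' ?>
<Root xmlns:ex='urn:stakx:example'>
  <Other><Ignored /></Other>
  <ex:ElementOfInterest Id='a' />
  <Other />
  <ex:ElementOfInterest Id='b' />
</Root>";

    public static List<ElementOfInterest> ReadAll(string xml)
    {
        var serializer = new XmlSerializer(typeof(ElementOfInterest));
        var results = new List<ElementOfInterest>();

        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            // ReadToFollowing advances past everything that does not match,
            // so irrelevant content is never materialized as objects.
            while (reader.ReadToFollowing("ElementOfInterest", "urn:stakx:example"))
            {
                // Disposing the subtree reader closes it, which leaves the outer
                // reader at the end of the element that was just deserialized.
                using (var subtree = reader.ReadSubtree())
                {
                    results.Add((ElementOfInterest)serializer.Deserialize(subtree));
                }
            }
        }

        return results;
    }

    public static void Main()
    {
        foreach (var item in ReadAll(SampleXml))
        {
            Console.WriteLine(item.Id);
        }
    }
}
```

With the sample above, Main writes the Id of each matched element in document order. If the elements carry an xsi:type attribute with a prefixed value, as in the question, the reader would presumably have to be created with the XmlParserContext shown in the answer instead.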


Source: https://habr.com/ru/post/981634/