Read XML files faster than xsd-generated classes

Question

I process a very large number of XML files containing HL7 data.

The structure of these XML files is described by several complex XSD files, which form a hierarchy, e.g.:

Messages.xsd
- batch.xsd
- datatypes.xsd
  - Fields.xsd
- MoreFiles.xsd
  - Fields.xsd

This is not the exact layout, but it shows how the files relate to each other.

Now I can run

xsd .\messages.xsd /classes

and it creates a file called messages.cs that is more than 240,000 lines long.

Note: despite the complexity of the XSDs, the actual XML files average about 250 lines of about 25 characters each (not very large).

I can use this file to deserialize my XML files as follows:

var bytes = Encoding.ASCII.GetBytes(message.Message);
var memoryStream = new MemoryStream(bytes);
var deserialized = ormSerializer.Deserialize(memoryStream); // renamed: "message" is already in use above

This all works fine and is fast.

When it comes time to pull data out of the XML structure, however, it is too slow.

Is there another way to access my XML data that would be faster? Should I use XPathDocument and XPathNavigator? Can XPathNavigator use all the XSD files, so that I don't need to recreate it for every XML file being processed (not all XML nodes are in all XML files)?

Any other ideas for getting XML data quickly?
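
For reference, the XPathDocument / XPathNavigator approach mentioned above might look roughly like the sketch below. It is only an illustration under assumptions: the element names (ORM_O01, PID, PID.3, CX.1) and the class and method names are hypothetical, and real HL7 XML typically declares a namespace that would need an XmlNamespaceManager. XPathDocument does not need the XSD files at all; the expression is compiled once and reused, and a missing node simply evaluates to an empty string.

```csharp
using System;
using System.IO;
using System.Xml.XPath;

static class Hl7XPathSketch
{
    // Compiled once and reused for every message.
    // The path is hypothetical; adjust it to the real message layout.
    static readonly XPathExpression PatientId =
        XPathExpression.Compile("string(/ORM_O01/PID/PID.3/CX.1)");

    public static string ReadPatientId(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            // XPathDocument is a read-only in-memory store intended for XPath queries.
            var navigator = new XPathDocument(reader).CreateNavigator();

            // string(...) yields "" when the node is absent, so optional
            // elements need no special handling.
            return (string)navigator.Evaluate(PatientId);
        }
    }

    static void Main()
    {
        const string sample =
            "<ORM_O01><PID><PID.3><CX.1>12345</CX.1></PID.3></PID></ORM_O01>";
        Console.WriteLine(ReadPatientId(sample));
    }
}
```

A new XPathDocument is still built per file, but that is a plain parse with no schema involved; only the compiled expressions are shared across files.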

+4

performance c# xml .net

Vaccano Dec 02 '11 at 21:49

2 answers

Have you looked at something like XStreamingReader? It lets you use LINQ to XML while streaming over large XML documents. I used it in the past and was able to stream through the XML, identify XML fragments, and deserialize them into objects. If you go this route and need examples, I can dig out the code.

http://xstreamingreader.codeplex.com/
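
The same streaming idea can be sketched with only the built-in XmlReader and LINQ to XML classes. This is not the XStreamingReader API, just the standard XmlReader plus XNode.ReadFrom pattern; the class name, the method name, and the element names (batch, OBX, OBX.5) are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class StreamingSketch
{
    // Yields each element with the given local name as a small XElement,
    // without materializing the rest of the document.
    public static IEnumerable<XElement> StreamElements(TextReader input, string localName)
    {
        using (var reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    static void Main()
    {
        // Hypothetical sample data.
        const string sample =
            "<batch><OBX><OBX.5>7.2</OBX.5></OBX><OBX><OBX.5>8.1</OBX.5></OBX></batch>";

        foreach (var obx in StreamElements(new StringReader(sample), "OBX"))
        {
            Console.WriteLine((string)obx.Element("OBX.5"));
        }
    }
}
```

Each yielded fragment can then be queried with LINQ or handed to a serializer for just that fragment's type.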

0

Derek Beattie Dec 03 '11 at 1:58

Michael Kay · Accepted Answer · 2011-12-02T23:33:38+0000

The technique you are using (automatically mapping XML to Java or C# classes) is called data binding, and it works well when the schema is small and simple. For something as big and ugly as HL7, I would say it's a non-starter.

What kind of processing are you doing? Is there a good reason why you can't do it in XSLT or XQuery? These languages are designed for processing XML, and they avoid the "impedance mismatch" you get when you have to convert data from the XML model to the data model of a programming language such as Java or C#.
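
As a rough illustration of the XSLT route from C#, the sketch below compiles a stylesheet once with XslCompiledTransform and applies it to each message. The stylesheet, the element names (ORM_O01, PID, PID.5, XPN.1), and the class and method names are all hypothetical. Note that XslCompiledTransform implements XSLT 1.0 only, and the .NET base class library has no built-in XQuery processor, so XSLT 2.0 or XQuery would need a third-party engine.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class XsltSketch
{
    // Hypothetical stylesheet that pulls a single field out as plain text.
    const string Stylesheet = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text""/>
  <xsl:template match=""/"">
    <xsl:value-of select=""/ORM_O01/PID/PID.5/XPN.1""/>
  </xsl:template>
</xsl:stylesheet>";

    // Compiling the stylesheet is the expensive step, so do it once.
    static readonly XslCompiledTransform Compiled = Load();

    static XslCompiledTransform Load()
    {
        var xslt = new XslCompiledTransform();
        using (var reader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(reader);
        }
        return xslt;
    }

    public static string Extract(string xml)
    {
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            // No stylesheet parameters are passed, hence the null argument list.
            Compiled.Transform(input, null, output);
            return output.ToString();
        }
    }

    static void Main()
    {
        const string sample =
            "<ORM_O01><PID><PID.5><XPN.1>Smith</XPN.1></PID.5></PID></ORM_O01>";
        Console.WriteLine(Extract(sample));
    }
}
```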
