SAX vs XmlTextReader - SAX in C#

I am trying to read a large XML document, and I want to do it in chunks rather than the way XmlDocument does it, which reads the entire file into memory. I know I can use XmlTextReader for this, but I was wondering whether anyone has used SAX for .NET. I know Java developers swear by it, and I was wondering whether it is worth trying and, if so, what the benefits of using it are. I am looking for specifics.

+12
java c# xml sax
Sep 24 '08 at 15:26
4 answers

If you are talking about SAX for .NET, the project does not appear to be maintained. The last release was more than 2 years ago. Maybe they got it perfect with that last release, but I wouldn't bet on it. The author, Karl Waclawek, seems to have disappeared from the net.

As for SAX under Java? You bet, it's great. Unfortunately, SAX was never developed as a standard, so all of the non-Java ports have adapted a Java API to their own needs. Although DOM is a pretty lousy API, it has the advantage of having been designed for multiple languages and environments, which makes it easy to implement in Java, C#, JavaScript, C, and so on.

+7
Sep 24 '08 at 15:58

If you just want to get the job done quickly, XmlTextReader (in .NET) exists for exactly this purpose.
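
A minimal sketch of what chunk-by-chunk reading with a pull reader can look like. The class and method names (StreamingCount, CountElements) are made up for illustration, and it uses XmlReader.Create rather than constructing XmlTextReader directly, which is the factory Microsoft's documentation has recommended since .NET 2.0. Only the current node is held in memory at any point.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class StreamingCount
{
    // Counts start tags by element name. The document is never loaded
    // as a whole; the reader moves forward one node per Read() call.
    public static Dictionary<string, int> CountElements(TextReader input)
    {
        var counts = new Dictionary<string, int>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    int seen;
                    counts.TryGetValue(reader.Name, out seen); // seen stays 0 if absent
                    counts[reader.Name] = seen + 1;
                }
            }
        }
        return counts;
    }
}
```

For a file on disk, passing File.OpenText(path) as the input should work the same way.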

If you want to learn a de facto standard (one that is also available in other programming languages) that is stable, that forces you to code very efficiently and elegantly, and that is also extremely flexible, then look into SAX. However, don't waste your time unless you are going to be writing esoteric XML parsers. Instead, look for the next-generation parsers (like XmlTextReader) for your particular platform.

SAX Resources
SAX was originally written for Java, and you can find the original open source project, which has been stable for several years, here: http://sax.sourceforge.net/

There is a C# port of the same project (with HTML docs included in the source download); it is also stable: http://saxdotnet.sourceforge.net/

If you do not like the C# implementation, you can always fall back on referencing the COM DLLs through COM interop, using MSXML3 or later: http://msdn.microsoft.com/en-us/library/ms994343.aspx

Articles that come from the Java world but probably illustrate the concepts you need to succeed with this approach (there may also be Java source code that is useful and simple enough to convert to C#):

It will be a cumbersome implementation. I only used SAX in my pre-.NET days, and it requires some fairly advanced coding techniques. At this point, it is just not worth the trouble.

An interesting hybrid parser concept
This thread describes a hybrid parser, built on the .NET XmlTextReader, that combines the advantages of DOM and SAX ...
http://bytes.com/groups/net-xml/178403-xmltextreader-versus-dom
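
One way to get a similar hybrid, sketched here under my own assumptions and not necessarily the approach taken in the linked thread: stream through the file with XmlReader, and materialize only one record at a time as a small in-memory tree. This version relies on LINQ to XML (System.Xml.Linq, .NET 3.5 or later); the names HybridReader and StreamElements are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class HybridReader
{
    // Yields each element named elementName as its own small tree.
    // Memory use is bounded by the largest single record, not the file.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node after it, so Read() must not be called here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Each yielded XElement can then be queried DOM-style, for example with order.Element("total"), while the surrounding document is still consumed forward-only.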

+9
Feb 13 '09 at 15:29

I believe that there are no benefits to using SAX for at least two reasons:

  • SAX is a push model, while XmlReader is a pull parser, which offers several advantages.
  • It depends on a third-party library rather than on the standard .NET API.
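
To make the push/pull distinction concrete, here is a rough sketch of a SAX-style push layer driven by the standard pull reader. The IContentEvents interface is hypothetical and only loosely modeled on SAX's ContentHandler; it is not the SAX for .NET API. With pull, your code owns the while loop; with push, the driver owns it and calls you back.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical callback interface (an assumption for this example).
public interface IContentEvents
{
    void StartElement(string name);
    void Characters(string text);
    void EndElement(string name);
}

public static class PushDriver
{
    // Owns the read loop and pushes events to the handler.
    public static void Parse(TextReader input, IContentEvents handler)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        handler.StartElement(reader.Name);
                        // <x/> yields no EndElement node, so synthesize the event.
                        if (reader.IsEmptyElement) handler.EndElement(reader.Name);
                        break;
                    case XmlNodeType.Text:
                    case XmlNodeType.CDATA:
                        handler.Characters(reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        handler.EndElement(reader.Name);
                        break;
                }
            }
        }
    }
}

// Example handler that just records the events it receives.
public sealed class RecordingHandler : IContentEvents
{
    public readonly List<string> Events = new List<string>();
    public void StartElement(string name) { Events.Add("start:" + name); }
    public void Characters(string text) { Events.Add("chars:" + text); }
    public void EndElement(string name) { Events.Add("end:" + name); }
}
```

Building push on top of pull like this is straightforward, while the reverse generally is not, which is one of the usual arguments for pull parsers.
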
+6
Sep 24 '08 at 23:16

Personally, I prefer the SAX model, because XmlReader has some really annoying traps that can cause bugs in which your code skips elements. Most code is structured around a while (rdr.Read()) loop, but if you call ReadString() or ReadInnerXml() inside that loop, those methods advance the reader themselves, and you will find yourself skipping elements on the next iteration.
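
A small sketch of the trap, using ReadInnerXml() on XML with no whitespace between elements (SkipTrapDemo and its methods are names invented for this example). ReadInnerXml() already leaves the reader on the node after the end tag, so the Read() at the top of the naive loop steps over the next start tag. The careful version only calls Read() when nothing else has advanced the reader.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class SkipTrapDemo
{
    // Naive loop: ReadInnerXml() advances, then Read() advances again.
    public static List<string> ReadNaive(string xml)
    {
        var result = new List<string>();
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml)))
        {
            while (rdr.Read())
            {
                if (rdr.NodeType == XmlNodeType.Element && rdr.Depth == 1)
                {
                    string name = rdr.Name;
                    result.Add(name + "=" + rdr.ReadInnerXml());
                }
            }
        }
        return result;
    }

    // Careful loop: advance with Read() only when the reader was not moved.
    public static List<string> ReadCareful(string xml)
    {
        var result = new List<string>();
        using (XmlReader rdr = XmlReader.Create(new StringReader(xml)))
        {
            while (!rdr.EOF)
            {
                if (rdr.NodeType == XmlNodeType.Element && rdr.Depth == 1)
                {
                    string name = rdr.Name;
                    result.Add(name + "=" + rdr.ReadInnerXml()); // already on the next node
                }
                else
                {
                    rdr.Read();
                }
            }
        }
        return result;
    }
}
```

With the input <items><a>1</a><b>2</b><c>3</c></items>, the naive loop misses b while the careful loop sees all three. With indented XML the whitespace nodes between elements tend to absorb the extra Read(), which is why the bug often only shows up on compact input.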

Because SAX is event-based, this can never happen: you cannot perform any operation that would make the parser read ahead.

My personal opinion is that Microsoft made up the idea that XmlReader is better, along with the push/pull explanation, and I don't really buy it. Microsoft's position is that you do not need to build a state machine with XmlReader, which makes no sense to me. But anyway, this is just my opinion.

+5
Aug 13 '09 at 10:31
