I need to parse an XML file that represents a really large tree structure, so I use the XmlReader class to populate the tree on the fly. Each node is handed only its own piece of XML, which it gets from its parent through ReadSubtree(). This has the advantage that a node doesn't have to worry about whether it has consumed exactly its own children and nothing more. But now I'm wondering if this is really a good idea, because there can be thousands of nodes, and when reading the .NET source files I found that each call to ReadSubtree() creates a couple (and probably more) new objects, with no caching of objects for reuse (as far as I could see).
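Roughly, this is the pattern I'm using. A minimal sketch, assuming a simplified placeholder node type (TreeNode and its members are made up for illustration, not my actual code):

```csharp
using System.Collections.Generic;
using System.Xml;

class TreeNode
{
    public string Name;
    public List<TreeNode> Children = new List<TreeNode>();

    // Each node consumes an XmlReader that covers only its own element,
    // obtained via ReadSubtree() from the parent's reader.
    public void Populate(XmlReader reader)
    {
        reader.MoveToContent();            // position on this node's element
        Name = reader.LocalName;

        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                var child = new TreeNode();
                // The child gets its own sub-reader and cannot read
                // past the end of its own element.
                using (XmlReader sub = reader.ReadSubtree())
                {
                    child.Populate(sub);
                }
                // Closing the sub-reader leaves the parent reader on the
                // child's EndElement node, so the loop continues cleanly.
                Children.Add(child);
            }
        }
    }
}
```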
Maybe ReadSubtree() wasn't designed with heavy use in mind, or maybe I'm just worrying about nothing and only need to call GC.Collect() after parsing the file...
Hope someone can shed some light on this.
Thanks in advance.
Update:
Thanks for the nice and insightful answers.
I had a deeper look at the .NET source code and found it more complex than I imagined. I finally abandoned the idea of calling this function in this scenario. As Stefan noted, the XmlReader is never handed out to outside code, and I can trust the code that parses the XML stream (since I wrote it myself), so I would rather make each node responsible for the amount of data it pulls from the stream than use the not-so-thin-after-all ReadSubtree() just to save a few lines of code.
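For illustration, here is a sketch of what I mean, reusing the placeholder TreeNode from above: each node reads directly from the single shared reader and is responsible for stopping at its own end tag, so no per-node sub-readers are allocated.

```csharp
using System.Collections.Generic;
using System.Xml;

class TreeNode
{
    public string Name;
    public List<TreeNode> Children = new List<TreeNode>();

    // Assumes the reader is positioned on this node's start element.
    // Consumes exactly this element, including its matching end tag.
    public void Populate(XmlReader reader)
    {
        Name = reader.LocalName;

        if (reader.IsEmptyElement)
        {
            reader.Read();                 // consume <Name/> and return
            return;
        }

        reader.Read();                     // step past the start tag

        // Read children until we reach our own end tag. Child nodes
        // consume their entire element, so the first EndElement we see
        // at this level is necessarily ours.
        while (reader.NodeType != XmlNodeType.EndElement)
        {
            if (reader.NodeType == XmlNodeType.Element)
            {
                var child = new TreeNode();
                child.Populate(reader);    // child consumes its whole element
                Children.Add(child);
            }
            else
            {
                reader.Read();             // skip text, comments, whitespace
            }
        }

        reader.Read();                     // consume our end tag
    }
}
```

The caller just positions the single reader on the root element and hands it over:

```csharp
using (var reader = XmlReader.Create("tree.xml"))
{
    reader.MoveToContent();
    var root = new TreeNode();
    root.Populate(reader);
}
```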