XmlReader ReadSubtree() abuse

I need to parse an XML file that is a dump of a really large tree structure, so I use the XmlReader class to populate the tree on the fly. Each node is handed only the piece of XML it expects, which its parent obtains by calling ReadSubtree(). This has the advantage that the parent does not have to worry about whether a node has consumed all of its children. But now I'm wondering whether this is really a good idea: there can be thousands of nodes, and from reading the .NET source I found that each call to ReadSubtree() creates a couple of new objects (probably more) and, from what I saw, nothing is cached for reuse.
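A minimal sketch of the pattern described above, for illustration only. The `TreeNode` type, the `node` element shape and the `id` attribute are assumptions, since the real file format and tree class are not shown here.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical node type; the real tree class is not shown in the question.
public sealed class TreeNode
{
    public string Id;
    public readonly List<TreeNode> Children = new List<TreeNode>();
}

public static class SubtreeTreeBuilder
{
    // Builds one node from a reader that covers exactly that node's element.
    public static TreeNode Build(XmlReader reader)
    {
        reader.MoveToContent(); // position on the element this reader covers
        var node = new TreeNode { Id = reader.GetAttribute("id") };

        while (reader.Read())
        {
            if (reader.NodeType != XmlNodeType.Element) continue;

            // One wrapper reader is allocated per child element.
            using (XmlReader child = reader.ReadSubtree())
            {
                node.Children.Add(Build(child));
            }
            // Disposing the child leaves `reader` at the end of that child element.
        }
        return node;
    }

    public static TreeNode Parse(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            return Build(reader);
        }
    }
}
```

With this shape a child cannot leave the parent's reader in the wrong place, but every element in the file costs one extra reader object, which is the allocation concern raised here.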

Maybe ReadSubtree() was not meant to be used this heavily, or maybe I'm worrying about nothing and just need to call GC.Collect() after parsing the file...

Hope someone can shed some light on this.

Thanks in advance.

Update:

Thanks for the nice and insightful answers.

I had a deeper look at the .NET source code and found it more complex than I had imagined. In the end I abandoned the idea of calling this function in this scenario. As Stefan noted, the XML reader is never passed on to outsiders, and I can trust the code that parses the XML stream (I wrote it myself), so I would rather make each node responsible for how much data it takes from the stream than use the not-so-thin-in-the-end ReadSubtree() function just to save a few lines of code.
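One way this could look, as a rough sketch rather than the code actually used: every node reads from the single shared reader and promises to stop just past its own end tag. `ParsedNode`, `ReadNode` and the `id` attribute are hypothetical names chosen for the example.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical node type used only for this illustration.
public sealed class ParsedNode
{
    public string Id;
    public readonly List<ParsedNode> Children = new List<ParsedNode>();
}

public static class DirectTreeBuilder
{
    // Precondition: reader is positioned on a start element.
    // Postcondition: reader is positioned just past that element.
    public static ParsedNode ReadNode(XmlReader reader)
    {
        var node = new ParsedNode { Id = reader.GetAttribute("id") };

        if (reader.IsEmptyElement)
        {
            reader.Read(); // consume the self-closing element
            return node;
        }

        reader.Read(); // step inside the element
        while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
        {
            if (reader.NodeType == XmlNodeType.Element)
                node.Children.Add(ReadNode(reader)); // the child consumes itself
            else
                reader.Read(); // skip text, comments and whitespace
        }
        reader.Read(); // consume the matching end tag
        return node;
    }

    public static ParsedNode Parse(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent(); // position on the root element
            return ReadNode(reader);
        }
    }
}
```

No wrapper readers are allocated, at the cost of a contract that every node must honor. The recursion depth equals the depth of the tree, so a very deep file might call for an explicit stack instead.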

+4
2 answers

ReadSubtree() returns an XmlReader that wraps the original XmlReader. The new reader looks like a complete document to whoever consumes it. That matters if the code you hand the subtree to expects a standalone XML document; for example, the Depth property of the new reader starts at 0. It is a pretty thin wrapper, so you will not use noticeably more resources than if you used the original XmlReader directly. In the example you describe, you most likely don't gain much from reading through a subtree reader.

The big advantage in your case would be that the subtree reader cannot accidentally read past the end of its subtree. Since the subtree reader is not very expensive, that safety may be reason enough, although it is most useful when you need the subtree to look like a document or when you do not trust the code to read only its own subtree.
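Both properties can be seen in a small sketch. The XML shape (`node` elements with an `id` attribute) and the `SubtreeBoundaryDemo` class are made up for the illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class SubtreeBoundaryDemo
{
    // Logs the outer depth of the first <node>, then "id@depth" for each element
    // seen through the subtree reader, then the id the outer reader reaches next.
    public static List<string> Run(string xml)
    {
        var log = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("node"); // first <node> in the document
            log.Add("outer depth " + reader.Depth);

            using (XmlReader sub = reader.ReadSubtree())
            {
                // Read() returns false at the end of the subtree, not at the
                // end of the document.
                while (sub.Read())
                {
                    if (sub.NodeType == XmlNodeType.Element)
                        log.Add(sub.GetAttribute("id") + "@" + sub.Depth);
                }
            }

            // The outer reader now sits at the end of the first <node>;
            // one more Read() moves on to whatever follows it.
            reader.Read();
            log.Add("next " + reader.GetAttribute("id"));
        }
        return log;
    }
}
```

For the input `<root><node id="a"><node id="a1"/></node><node id="b"/></root>` the log is `outer depth 1`, `a@0`, `a1@1`, `next b`: the subtree reader reports depths relative to its own root and never reaches `b`.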

As the other answer notes, you never want to call GC.Collect(). It will never improve performance.

+10

Assuming that all the objects are created on the regular managed heap and not on the large object heap (i.e. each one is smaller than 85,000 bytes), there really should not be any problem; this is exactly what the GC was designed for.

I would suggest that you do not need to call GC.Collect at the end of the process either; in general, letting the GC schedule its own collections allows it to work optimally (see this blog post for a very detailed explanation of the GC, which explains this a lot better than I can).

+2

Source: https://habr.com/ru/post/1277014/

