A SAX parser on its own does not build a representation of your XML tree in memory (which is why SAX is more memory efficient). It only fires "events" as it encounters each XML element. You have to keep the context yourself (often a stack of parent elements) to "know" where you are in the tree.
Since you do not have a tree in memory, you cannot use XPath. You can only inspect the current "context" (your manually managed stack) to query your document. Remember that the SAX parser makes a single pass over the file, so the order of elements in the file matters.
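To make the stack idea concrete, here is a minimal sketch using Python's standard `xml.sax` module (chosen only for illustration; the same pattern applies to SAX in any language). The names `PathHandler`, `find_texts` and `SAMPLE` are made up for this example. The handler keeps a stack of open element names and collects text only when the stack equals the path it is looking for, which is roughly what you end up doing by hand in place of an XPath like /library/book/title.

```python
import xml.sax


class PathHandler(xml.sax.ContentHandler):
    """Collects the text of every element whose full path equals `target`."""

    def __init__(self, target):
        super().__init__()
        self.target = target   # e.g. ["library", "book", "title"]
        self.stack = []        # names of the elements that are currently open
        self.buffer = None     # text chunks while inside a matching element
        self.matches = []

    def startElement(self, name, attrs):
        self.stack.append(name)
        if self.stack == self.target:
            self.buffer = []   # start collecting text

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.buffer is not None:
            self.buffer.append(content)

    def endElement(self, name):
        if self.stack == self.target and self.buffer is not None:
            self.matches.append("".join(self.buffer).strip())
            self.buffer = None
        self.stack.pop()


def find_texts(xml_bytes, target):
    """Single pass over the document; returns matching texts in file order."""
    handler = PathHandler(target)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


# Hypothetical sample document, just for the demonstration.
SAMPLE = b"""<library>
  <book><title>First</title><author>A</author></book>
  <magazine><title>Skipped</title></magazine>
  <book><title>Second</title></book>
</library>"""

if __name__ == "__main__":
    print(find_texts(SAMPLE, ["library", "book", "title"]))
```

Note that the results come back in document order, and anything you need from earlier in the file has to be saved by the handler when it goes past, because the parser never goes back.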
Fortunately, there are other approaches. One example is VTD-XML, a library that builds the XML tree in memory, but only its structure: it does not extract the actual content from the file up front, and content is pulled out only when needed. This is much more memory efficient than a DOM parser, and XPath still works. I personally use this library at work to parse ~700 MB of XML files with XPath (yes, it's insane, but it works, and it's very fast).