For a large file, you want to use a SAX parser, not a DOM parser.
A DOM parser reads the entire file and loads it into a tree of objects in memory. A SAX parser reads the file sequentially and calls user-defined callback functions to process the data as it goes (start tags, end tags, character data, etc.).
With a SAX parser you have to maintain your own state (for example, which tag you are currently inside), which makes the code a little more complicated, but for a large file it will be much more memory efficient, since the whole document is never held in memory at once.
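As a rough illustration, here is a minimal sketch using Python's standard `xml.sax` module. The answer doesn't name a language, so Python is an assumption, and the element names (`items`, `item`, `name`), the `ItemCounter` class, and the `summarize` helper are all made up for the example. It shows the callbacks firing as the parser streams through the input, and the tag stack you have to keep yourself:

```python
import io
import xml.sax


class ItemCounter(xml.sax.ContentHandler):
    """Hypothetical handler: counts <item> elements and collects <name> text."""

    def __init__(self):
        super().__init__()
        self.path = []       # stack of currently open tags (state we track ourselves)
        self.item_count = 0
        self.names = []
        self._buffer = []    # pieces of text for the <name> being read

    def startElement(self, name, attrs):
        self.path.append(name)
        if name == "item":
            self.item_count += 1
        elif name == "name":
            self._buffer = []

    def characters(self, content):
        # The parser may deliver one text node in several chunks, so accumulate.
        if self.path and self.path[-1] == "name":
            self._buffer.append(content)

    def endElement(self, name):
        if name == "name":
            self.names.append("".join(self._buffer).strip())
        self.path.pop()


def summarize(source):
    """Parse a file path or file-like object without building a tree."""
    handler = ItemCounter()
    xml.sax.parse(source, handler)
    return handler.item_count, handler.names


# Small in-memory document standing in for a large file.
sample = io.BytesIO(
    b"<items><item><name>a</name></item><item><name>b</name></item></items>"
)
count, names = summarize(sample)
```

For a real file you would pass a path or an open file instead of the in-memory sample, e.g. `summarize("big.xml")` (the filename is hypothetical). Only the handler's own fields grow with the input, so memory use depends on what you choose to keep, not on the size of the document.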
Eric Petroelje, Jul 22 '09 at 17:58