It looks to me as if you don't need any DOM features in your program. I would second the use of the (c)ElementTree library. If you use the iterparse function of the cElementTree module, you can work your way through the XML and handle events as they occur.
Note, however, Fredrik's advice on using cElementTree's iterparse function:
To parse large files, you can get rid of elements as soon as you've processed them:
    for event, elem in iterparse(source):
        if elem.tag == "record":
            ... process record elements ...
            elem.clear()
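As a rough, runnable sketch of the pattern above: on current Python 3 the separate cElementTree module no longer exists, and `xml.etree.ElementTree` uses the C accelerator automatically when it is available. The function name `count_records`, the `tag` parameter, and the sample XML are illustrative assumptions, not part of the original advice.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, tag="record"):
    """Stream through `source`, clearing each record once it has been handled.

    `count_records` and `tag` are hypothetical names used for illustration.
    """
    count = 0
    # With no `events` argument, iterparse reports only "end" events,
    # so each element is complete by the time we see it.
    for event, elem in ET.iterparse(source):
        if elem.tag == tag:
            count += 1    # stand-in for real per-record processing
            elem.clear()  # drop the element's children, text and attributes
    return count


if __name__ == "__main__":
    sample = io.BytesIO(b"<root><record>a</record><record>b</record></root>")
    print(count_records(sample))
```

Counting is only a placeholder here; any per-record work can go in its place, as long as it finishes before `elem.clear()` is called.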
The above pattern has one drawback: it does not clear the root element, so you end up with a single element that has lots of empty child elements. If your files are huge, rather than just large, this might be a problem. To work around this, you need to get your hands on the root element. The easiest way to do this is to enable start events and save a reference to the first element in a variable:
    # get an iterable
    context = iterparse(source, events=("start", "end"))
lxml.iterparse() does not allow this.
The previous snippet does not work on Python 3.7; consider the following way to get the first element instead.
    # get an iterable
    context = iterparse(source, events=("start", "end"))
    is_first = True
    for event, elem in context:
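The loop body is cut off in the snippet above. One way it might continue, as a minimal sketch under stated assumptions: the first event seen is the root's "start" event, so the `is_first` flag can be used to keep a reference to the root, which is then cleared after each record. The function name `process_records`, the `handle` callback, and the `tag` parameter are hypothetical names introduced for this example.

```python
import io
import xml.etree.ElementTree as ET


def process_records(source, handle, tag="record"):
    """Call `handle(elem)` for every record, clearing the root as we go.

    `process_records`, `handle` and `tag` are illustrative names only.
    Returns the root element so callers can inspect what is left of it.
    """
    root = None
    is_first = True
    # Request "start" events too, so the root is seen before its children.
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if is_first:
            root = elem       # the very first event is the root's "start"
            is_first = False
        if event == "end" and elem.tag == tag:
            handle(elem)      # the record is complete at its "end" event
            root.clear()      # discard already-processed children of the root
    return root


if __name__ == "__main__":
    xml = b"<root><record>a</record><record>b</record></root>"
    root = process_records(io.BytesIO(xml), lambda e: print(e.text))
    print(len(root))  # number of children still attached to the root
```

Because `root.clear()` also wipes the root's own attributes and text, anything needed from the root should be read before the first record is processed.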
Steen, Nov 28 '08 at 20:03