I am trying to parse some XML that has the following format:
    <label>
        <name></name>
        <sometag></sometag>
        <sublabels>
            <label></label>
            <label></label>
        </sublabels>
    </label>
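Parsing it like this (with lxml, where f is the path to the gzipped file):

    import gzip
    from lxml import etree

    # Only 'end' events are requested, so no event check is needed in the loop.
    for event, element in etree.iterparse(gzip.GzipFile(f), events=('end',), tag='label'):
        name = element.xpath('name/text()')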
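leaves name empty on some iterations, because the tag filter also matches the label elements nested inside sublabels, and those have no name child:

    <sublabels>
        <label></label>
        <label></label>
    </sublabels>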
Question:
Is there a way to limit the depth at which iterparse matches, or to ignore the nested label elements, other than checking whether name is empty?
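
To make the idea of "depth" concrete, here is a minimal sketch of one possible approach: request both 'start' and 'end' events and count how many label elements are currently open, so that only the outermost one is processed. It uses the standard library's xml.etree.ElementTree so that it runs without extra dependencies. The same counter should carry over to lxml's iterparse, but that is an assumption and has not been checked against the real file. The function name top_level_label_names, the labels root element and the sample data are all made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data; the <labels> root element is an assumption.
SAMPLE = b"""<labels>
<label><name>First</name><sublabels><label>Sub A</label><label>Sub B</label></sublabels></label>
<label><name>Second</name></label>
</labels>"""


def top_level_label_names(source):
    """Return the <name> text of every outermost <label> in source."""
    names = []
    depth = 0  # number of <label> elements that are currently open
    for event, element in ET.iterparse(source, events=('start', 'end')):
        if element.tag != 'label':
            continue
        if event == 'start':
            depth += 1
            continue
        # event == 'end'
        depth -= 1
        if depth == 0:
            # Outermost label: its children are complete at this point.
            names.append(element.findtext('name'))
            element.clear()  # release the finished subtree
    return names


print(top_level_label_names(io.BytesIO(SAMPLE)))  # ['First', 'Second']
```

Any binary file object can be passed as source, so gzip.GzipFile(f) should work in place of the io.BytesIO used here. The nested labels are deliberately not cleared when they end, so their content is still available when the enclosing label is processed.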