How can I read an XML file using Python ElementTree if the XML has multiple top-level elements?
I have an XML file that I would like to read using Python ElementTree.
Unfortunately, it has several top-level tags. I would wrap the content in <doc>...</doc>, but the opening <doc> tag has to go after the <?xml> declaration and the <!DOCTYPE>, and figuring out where the <!DOCTYPE> ends is nontrivial.
What I have:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FOO BAR "foo.dtd" [
<!ENTITY ...>
<!ENTITY ...>
<!ENTITY ...>
]>
<ARTICLE> ... </ARTICLE>
<ARTICLE> ... </ARTICLE>
<ARTICLE> ... </ARTICLE>
<ARTICLE> ... </ARTICLE>
What I want:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FOO BAR "foo.dtd" [
<!ENTITY ...>
<!ENTITY ...>
<!ENTITY ...>
]>
<DOC>
<ARTICLE> ... </ARTICLE>
<ARTICLE> ... </ARTICLE>
<ARTICLE> ... </ARTICLE>
<ARTICLE> ... </ARTICLE>
</DOC>
NB the tag name ARTICLE may change, so I cannot grep for it.
Can someone suggest how I can wrap the XML that follows the header in <doc>...</doc>, or suggest another workaround?
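One possible workaround, as a rough sketch rather than a definitive solution: scan past the XML declaration and the DOCTYPE by tracking quotes, comments and the [ ... ] internal subset, then insert the wrapper tag at that point. The names split_prolog and parse_multi_root are made up for illustration, and the sample uses SYSTEM in the DOCTYPE as an assumption, since the BAR in the snippet above is a placeholder. The scanner assumes nothing but whitespace sits between the declaration and the DOCTYPE, and it does not handle processing instructions inside the internal subset.

```python
import re
import xml.etree.ElementTree as ET


def split_prolog(text):
    """Split text into (prolog, body); prolog is the XML declaration plus DOCTYPE."""
    pos = 0
    # Skip an optional XML declaration such as <?xml version="1.0"?>.
    decl = re.match(r"\s*<\?xml.*?\?>", text, re.S)
    if decl:
        pos = decl.end()
    # Skip an optional DOCTYPE, which may contain an internal subset in [ ... ].
    doctype = re.compile(r"\s*<!DOCTYPE").match(text, pos)
    if doctype:
        i = doctype.end()
        depth = 0      # > 0 while inside the [ ... ] internal subset
        quote = None   # the active quote character, if inside a literal
        while i < len(text):
            c = text[i]
            if quote:
                if c == quote:
                    quote = None
            elif text.startswith("<!--", i):
                # Jump to the end of the comment so its content is ignored.
                end = text.find("-->", i + 4)
                if end == -1:
                    raise ValueError("unterminated comment in DOCTYPE")
                i = end + 2
            elif c in "\"'":
                quote = c
            elif c == "[":
                depth += 1
            elif c == "]":
                depth -= 1
            elif c == ">" and depth == 0:
                pos = i + 1  # first character after the DOCTYPE
                break
            i += 1
        else:
            raise ValueError("unterminated DOCTYPE")
    return text[:pos], text[pos:]


def parse_multi_root(text, wrapper="DOC"):
    """Wrap everything after the prolog in <wrapper> and parse the result."""
    prolog, body = split_prolog(text)
    return ET.fromstring(f"{prolog}<{wrapper}>{body}</{wrapper}>")


sample = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FOO SYSTEM "foo.dtd" [
<!ENTITY greet "hello > world">
]>
<ARTICLE>&greet;</ARTICLE>
<ARTICLE>two</ARTICLE>
"""

root = parse_multi_root(sample)
print(root.tag, [child.tag for child in root])
```

Because the scan never looks at the element names, it should keep working if ARTICLE changes. The parser behind ElementTree does not validate, so the wrapper name not matching the DOCTYPE name is not a problem for it.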