The example data you provided indicates one problem, while the question and exception you provided suggest another. Do you have several XML documents combined together, each with its own XML declaration, or do you have an XML fragment with several top-level elements?
If it is the former, the solution involves splitting the input stream into several streams and parsing each one individually. This does not necessarily mean, as one comment suggests, implementing an XML parser. You can search the input for XML declarations without parsing anything else in it, provided your input does not include CDATA sections containing unescaped XML declarations. You can write a file-like object that returns characters from the underlying stream until it reaches an XML declaration, and then wrap it in a generator function that yields such streams until EOF is reached. This is not trivial, but it is not too difficult either.
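As a rough illustration of the declaration-splitting idea, here is a minimal sketch that works on text already read into memory rather than on a true file-like wrapper. It assumes the input is a `str`, that every document begins with an XML declaration, and that no CDATA section contains a literal declaration. The names `XML_DECL`, `split_documents`, and `parse_all` are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Matches an XML declaration such as <?xml version="1.0"?>.
# The required whitespace after "xml" keeps it from matching
# other processing instructions like <?xml-stylesheet ...?>.
XML_DECL = re.compile(r"<\?xml\s[^>]*\?>")


def split_documents(text):
    """Yield each concatenated XML document in `text` as its own string."""
    starts = [m.start() for m in XML_DECL.finditer(text)]
    if not starts:
        # No declaration found: treat the whole input as one document.
        if text.strip():
            yield text
        return
    # Each document runs from its declaration to the next one (or to EOF).
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        yield text[begin:end].strip()


def parse_all(text):
    """Parse every document found in `text` and return the root elements."""
    return [ET.fromstring(doc) for doc in split_documents(text)]
```

For input too large to hold in memory, the same scan would have to be done incrementally over the stream, as described above.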
If you have an XML fragment with several top-level elements, you can simply wrap them in a single enclosing XML element and parse the result.
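A minimal sketch of the wrapping approach, assuming the fragment is a `str` with no XML declaration or DOCTYPE of its own (either would be illegal inside the wrapper). The function name `parse_fragment` and the wrapper tag `root` are arbitrary choices for this example.

```python
import xml.etree.ElementTree as ET


def parse_fragment(fragment, wrapper="root"):
    """Parse a fragment with several top-level elements.

    The fragment is enclosed in a synthetic wrapper element so that it
    becomes a well-formed document; the original top-level elements are
    returned as a list, in document order.
    """
    wrapped = f"<{wrapper}>{fragment}</{wrapper}>"
    return list(ET.fromstring(wrapped))
```

For example, `parse_fragment("<a>1</a><b/>")` should give back two elements, tagged `a` and `b`.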
Of course, as with most problems with incorrect XML input, the easiest solution is to fix what creates the bad input.