Android SAX XML parser question, using Java: I need to parse XML files that I download from the Internet and do not control. Some of them contain errors that make the parser abort with messages such as “inappropriate tag” or “malformed (invalid token)”.
These errors do not matter to me. I want to ignore them and keep going, because my code can cope with a broken XML structure. I cannot fix the XML files, though, since they are not mine. How can I tell SAX on Android (the org.xml.sax.XMLReader class) not to throw an exception and to continue parsing? Attaching an ErrorHandler did not work, and catching the exception is useless because I cannot resume parsing where it left off.
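To make the ErrorHandler point concrete, here is a minimal sketch of what I mean. The class names RecordingErrorHandler and LenientHandlerDemo are made up for this illustration, and it runs against the desktop JDK parser rather than Android's, so exact messages and behavior may differ there. The handler never throws, yet the parse still stops at the first fatal error. As far as I can tell this matches the SAX documentation for ErrorHandler.fatalError, which allows a parser to stop reporting events once that method has been invoked.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in for a lenient handler: records every callback, never throws. */
class RecordingErrorHandler implements ErrorHandler {
    final List<String> calls = new ArrayList<>();
    @Override public void warning(SAXParseException e) { calls.add("warning"); }
    @Override public void error(SAXParseException e) { calls.add("error"); }
    @Override public void fatalError(SAXParseException e) { calls.add("fatalError"); }
}

class LenientHandlerDemo {
    /** Parses the given XML and returns a one-line summary of what happened. */
    static String run(String xml) {
        RecordingErrorHandler errors = new RecordingErrorHandler();
        List<String> elements = new ArrayList<>();
        String outcome;
        try {
            XMLReader r = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            r.setErrorHandler(errors);
            r.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    elements.add(qName); // remember which elements were reached
                }
            });
            r.parse(new InputSource(new StringReader(xml)));
            outcome = "completed";
        } catch (SAXParseException e) {
            // The parser gave up even though the handler swallowed the error.
            outcome = "aborted";
        } catch (Exception e) {
            throw new RuntimeException(e); // setup or I/O problems, not the point here
        }
        return outcome + " elements=" + elements + " handlerCalls=" + errors.calls;
    }

    public static void main(String[] args) {
        // The bare '&' is not well-formed, so the element after it is never reported.
        System.out.println(run("<root><a/>odds & ends<b/></root>"));
    }
}
```

Once the exception has been caught, the XMLReader offers no position or state from which to continue, which is why catching it does not help me.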
My XML is not HTML, but here are some (X)HTML examples where browsers ignore errors and keep going. I want to do the same.
- Browsers are fine with "<br>" instead of "<br />", even if the tag is never closed.
- "<b><i>text</b></i>" works even though the closing tags are in the wrong order.
- "odds & ends" is accepted despite the invalid token; "odds &amp; ends" would be correct.
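For reference, a small sketch that feeds snippets modeled on the three cases above to a strict SAX parser. The class name WellFormednessCheck and the wrapping `<p>` element are assumptions added only to make the snippets complete documents, and the sketch targets the desktop JDK parser, not Android's. Each broken snippet is rejected, while the corrected form is accepted.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormednessCheck {
    /** Returns true if a default SAX parser reads the whole document without a fatal error. */
    static boolean isWellFormed(String xml) {
        try {
            // DefaultHandler ignores content events and rethrows fatal errors.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            return false; // the parser rejected the document
        } catch (Exception e) {
            throw new RuntimeException(e); // configuration or I/O failure
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "<p>line<br></p>",             // unclosed tag
            "<p><b><i>text</b></i></p>",   // closing tags in the wrong order
            "<p>odds & ends</p>",          // bare ampersand
            "<p>line<br/> <b><i>text</i></b> odds &amp; ends</p>" // corrected form
        };
        for (String s : samples) {
            System.out.println(isWellFormed(s) + "  " + s);
        }
    }
}
```

A browser would render all four; what I am after is that kind of tolerance for the first three.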
I would prefer not to write my own parser and deal with character set conversions and all that. I do not need to validate the XML. Here is my code, reduced to the essentials:
    XMLReader r = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    r.setErrorHandler(new MyLenientErrorHandlerThatNeverThrows());
    r.setContentHandler(new MyImporterThatExtendsDefaultHandler());
    r.parse(new InputSource(new BufferedReader(...)));
Thanks!
user1225364