How to parse malformed HTML on Android?

How can I parse invalid HTML on Android?

I tried to use XOM and TagSoup, but when I create a Builder I get the following error:

11-26 20:42:39.294: ERROR/dalvikvm(1298): Could not find method org.apache.xerces.impl.Version.getVersion, referenced from method nu.xom.Builder. 

Do I need to add Xerces in order to use XOM, or can I use TagSoup without XOM?

2 answers

You could look at JTidy ( http://jtidy.sourceforge.net/ ), a Java port of HTML Tidy. It should be fairly lightweight, and it can output XHTML on request.


XOM may require Xerces to be on the classpath; this may depend on the Java version. We are currently using

 xercesImpl-2.8.0.jar
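
On the second half of the question: TagSoup's parser implements the standard SAX XMLReader interface, so it can in principle be driven through plain SAX without XOM. Below is a minimal sketch of that idea, not a tested Android recipe. The class name LinkCollector and the method extractLinks are made up for illustration, and the reader is passed in as a parameter so the sketch itself depends only on org.xml.sax.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the href of every <a> element reported by a SAX reader.
final class LinkCollector {

    static List<String> extractLinks(XMLReader reader, String html)
            throws IOException, SAXException {
        final List<String> links = new ArrayList<String>();

        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // localName is empty when the reader is not namespace-aware,
                // so fall back to the qualified name in that case.
                String name = localName.isEmpty() ? qName : localName;
                if ("a".equalsIgnoreCase(name)) {
                    String href = atts.getValue("href");
                    if (href != null) {
                        links.add(href);
                    }
                }
            }
        });

        // The reader decides how forgiving the parse is; this method only
        // wires up the handler and feeds it the markup.
        reader.parse(new InputSource(new StringReader(html)));
        return links;
    }
}
```

With the TagSoup jar in the project, the reader argument would be new org.ccil.cowan.tagsoup.Parser(), for example LinkCollector.extractLinks(new org.ccil.cowan.tagsoup.Parser(), html). If a XOM Document is still wanted, XOM also has a Builder constructor that accepts an XMLReader, so the same TagSoup parser can be handed to new Builder(reader).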

Source: https://habr.com/ru/post/1337539/
