If the HTML is well-formed, you can parse it with the regular SAX parser.
Unfortunately, HTML is often not well-formed. In that case, you can first clean up the HTML on the server using TagSoup. If that is not possible, you can try running JTidy on the device.
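For the well-formed case, here is a minimal sketch that uses only the standard `javax.xml.parsers` and `org.xml.sax` classes. The class name `LinkExtractor` and the method `extractLinks` are made up for illustration; pulling out `href` values is just an example task. It assumes the input has no undeclared entities such as `&nbsp;` and no external DTD reference, since a plain XML parser will choke on those.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
public class LinkExtractor {

    /** Collects the href of every <a> element in a well-formed (X)HTML string. */
    public static List<String> extractLinks(String html) throws Exception {
        final List<String> links = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                // Depending on namespace settings, the tag name arrives in
                // qName or localName, so check whichever is non-empty.
                String name = (qName == null || qName.isEmpty()) ? localName : qName;
                if ("a".equalsIgnoreCase(name)) {
                    String href = attrs.getValue("href");
                    if (href != null) {
                        links.add(href);
                    }
                }
            }
        };

        // Throws SAXParseException if the markup is not well-formed.
        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(html)), handler);
        return links;
    }
}
```

If the markup is not well-formed, the same handler should be reusable with TagSoup: as far as I know, its `org.ccil.cowan.tagsoup.Parser` class implements `XMLReader`, so you would call `setContentHandler(handler)` and `parse(...)` on it instead of going through `SAXParserFactory`. Check this against the TagSoup version you bundle.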
How to parse (not correctly formed) HTML code in Android?