I came across this link while looking into your question: http://benedictcohen.co.uk/blog/archives/74
The author explains an older approach than the one @CodaFi suggested, but there is a relevant update at the end that you should check:
The easiest way to parse HTML is to treat it as XML and use NSXMLParser. iOS comes with LibTidy, which is capable of cleaning up a lot of messy markup. Use LibTidy to create pure XML and pass this XML to NSXMLParser. Only use the approach described above if it is not possible to use NSXMLParser.
So maybe option 4 or 5 is for you?
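
To make the quoted update a bit more concrete, here is a minimal sketch of the second half of that pipeline: feeding already well-formed XHTML to NSXMLParser (called XMLParser in Swift). It assumes the LibTidy clean-up step has already been done, so the input string below is a stand-in for tidied output. The LinkCollector class and the sample markup are made up for illustration and are not from the linked post.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives here on non-Apple platforms
#endif

// Hypothetical delegate: collects the href and text of every <a href="..."> element.
final class LinkCollector: NSObject, XMLParserDelegate {
    private(set) var links: [(href: String, text: String)] = []
    private var currentHref: String?
    private var currentText = ""

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        if elementName == "a" {
            // Start buffering text only for anchors that carry an href.
            currentHref = attributeDict["href"]
            currentText = ""
        }
    }

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // Text may arrive in several chunks, so append rather than assign.
        if currentHref != nil { currentText += string }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if elementName == "a", let href = currentHref {
            links.append((href: href, text: currentText))
            currentHref = nil
        }
    }
}

// Assumption: this string stands in for markup that LibTidy has already
// converted to well-formed XHTML. Raw, malformed HTML would fail to parse here.
let xhtml = """
<html><body><p>See <a href="http://example.com/">the example</a>.</p></body></html>
"""

let collector = LinkCollector()
let parser = XMLParser(data: Data(xhtml.utf8))
parser.delegate = collector

if parser.parse() {
    for link in collector.links {
        print(link.href, link.text)
    }
} else {
    print("Parse failed:", parser.parserError?.localizedDescription ?? "unknown error")
}
```

If parse() returns false on your real pages, that is usually a sign the markup is not well-formed yet and still needs to go through LibTidy first.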