How to parse HTML content as XML using TagSoup in Android

Can someone tell me how to parse HTML content as XML using TagSoup in Android? I am looking for working code examples, if possible.

2 answers
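    import org.xml.sax.Attributes;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.DefaultHandler;
    import org.xml.sax.helpers.XMLReaderFactory;

    // Ask SAX for the TagSoup parser by class name (the TagSoup jar must be on the classpath)
    XMLReader xmlReader = XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");

    // Handle SAX events; TagSoup reports even malformed HTML as well-formed element events
    ContentHandler handler = new DefaultHandler() {
        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
            // ...
        }
    };
    xmlReader.setContentHandler(handler);

    // `input` is assumed to be an InputStream or Reader over the HTML
    xmlReader.parse(new InputSource(input));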
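The handler body above is left open, so here is a minimal sketch of one thing it might do: collect the `href` of every `<a>` element. The class name `LinkCollector` and its helper methods are hypothetical, not part of TagSoup. To keep the sketch runnable without the TagSoup jar, it uses the JDK's own namespace-aware SAX reader on well-formed markup; on Android you would presumably pass the TagSoup `XMLReader` from the snippet above to `collect` instead, since a handler only sees SAX events and does not care which parser produced them.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records the href attribute of every <a> element.
class LinkCollector extends DefaultHandler {
    private final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // With a namespace-aware reader, localName is "a" for anchor elements.
        if ("a".equals(localName)) {
            String href = attributes.getValue("href");
            if (href != null) {
                links.add(href);
            }
        }
    }

    List<String> getLinks() {
        return links;
    }

    // Runs the given XMLReader over the markup and returns the collected links.
    static List<String> collect(XMLReader reader, String markup) throws Exception {
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.getLinks();
    }

    // Stand-in reader from the JDK so the sketch runs without TagSoup;
    // unlike TagSoup, it only accepts well-formed markup.
    static XMLReader newJdkReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        String markup = "<html><body><a href=\"/one\">One</a>"
                + "<p><a href=\"/two\">Two</a></p></body></html>";
        System.out.println(collect(newJdkReader(), markup));
    }
}
```

Matching on `localName` rather than `qName` is a deliberate choice here: it stays the same whether or not the parser reports elements in the XHTML namespace.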

The code below fetches a web page and uses TagSoup to build an XOM Document from it, which you can then query.

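    import java.io.IOException;
    import java.io.InputStream;
    import nu.xom.Builder;
    import nu.xom.Document;
    import nu.xom.Nodes;
    import nu.xom.ParsingException;
    import org.apache.http.HttpEntity;
    import org.apache.http.HttpResponse;
    import org.apache.http.StatusLine;
    import org.apache.http.client.HttpClient;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLReaderFactory;

    HttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet("http://streak.espn.go.com/en/?date=20120824");
    HttpResponse response = client.execute(request);

    // Check if server response is valid
    StatusLine status = response.getStatusLine();
    if (status.getStatusCode() != 200) {
        throw new IOException("Invalid response from server: " + status.toString());
    }

    // Pull content stream from response
    HttpEntity entity = response.getEntity();
    InputStream inputStream = entity.getContent();

    try {
        XMLReader parser = XMLReaderFactory.createXMLReader("org.ccil.cowan.tagsoup.Parser");

        // Use the TagSoup parser to build an XOM document from the HTML stream
        Document doc = new Builder(parser).build(inputStream);

        // Query the document as needed; query() takes an XPath expression and returns Nodes
        Nodes nodes = doc.query("...");
    } catch (SAXException e) {
        // Thrown if the TagSoup parser class cannot be loaded
        ...
    } catch (ParsingException e) {
        ...
    } catch (IOException e) {
        ...
    } finally {
        // Release the connection's content stream
        inputStream.close();
    }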

Source: https://habr.com/ru/post/897203/

