Android: parse HTML from a page

I would like to parse the text from a web page.

Is there an easy way to store, for example, the product information in a string? Example URL: http://upcdata.info/upc/7310870008741

Thanks.

3 answers

Jsoup handles parsing plain HTML from Android applications well:

http://jsoup.org/

To get the page, simply do the following:

    URL url = new URL("http://upcdata.info/upc/7310870008741");
    Document document = Jsoup.parse(url, 5000);

Then you can extract everything you need from the Document. See this link for a brief description of how to extract part of the page:

http://jsoup.org/cookbook/extracting-data/dom-navigation


If you want to read the contents of the URL into a string:

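    import java.io.*;
    import java.net.*;

    StringBuilder myString = new StringBuilder();
    try {
        URL u = new URL("http://www.google.com");
        // BufferedReader replaces the deprecated DataInputStream.readLine()
        try (BufferedReader in = new BufferedReader(new InputStreamReader(u.openStream()))) {
            String thisLine;
            while ((thisLine = in.readLine()) != null) {
                myString.append(thisLine).append('\n'); // keep the line breaks
            }
        }
    } catch (MalformedURLException e) {
        e.printStackTrace(); // the URL string is malformed
    } catch (IOException e) {
        e.printStackTrace(); // the network read failed
    }
    // Call toString() on myString to get the contents of the page the URL points to.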

This will give you a plain old string, HTML markup and all.
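
Once the page is in a string, a specific piece can be pulled out of it. The following is a minimal sketch, not part of the original answer, that extracts the contents of the <title> element with a regular expression from the standard library. The class name TitleExtractor, the method extractTitle and the sample HTML are hypothetical; regular expressions are fragile against real-world HTML, so for anything beyond a simple case a parser such as Jsoup is likely the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleExtractor {
    // Matches the contents of the first <title> element, case-insensitively,
    // and lets the title text span several lines (DOTALL).
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns the trimmed title text, or null if no title element is found. */
    public static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        // Hypothetical page content, standing in for myString.toString().
        String html = "<html><head><title> Example product </title></head>"
                + "<body><p>Details</p></body></html>";
        System.out.println(extractTitle(html)); // prints "Example product"
    }
}
```

In the snippet above it would be called as TitleExtractor.extractTitle(myString.toString()).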

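To strip the markup from an HTML string and keep only the text, you can use Android's Html.fromHtml:

    String tmpHtml = "<html>a whole bunch of html stuff</html>";
    String htmlTextStr = Html.fromHtml(tmpHtml).toString();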

Source: https://habr.com/ru/post/1335556/

