Extracting a thumbnail related to a news article

I'm not sure if I tagged this question correctly, but I know that a lot of news apps, such as Pulse and Google Currents, pull in a thumbnail for each news article. I assume they get their content from the news site itself, either by scraping the page or by consuming some kind of feed. How do they know which image to pull from the site?

I added the "android" tag because I have built a news reader application and would like to show thumbnails in it. Thanks.

+4
2 answers

I built something like this myself a while ago, using this approach:

  • Process the article with a Readability port (for Java, a Google search turns up jReadability, Snacktory and Java-readability; there are probably more, and one of them should work on Android as well).
  • In the processed article, grab the first image by using a DOM parser to jump to the first img tag. Since the article is “clean” at that point, this is usually a useful hit.

I would recommend processing articles on a server rather than on the phone.
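
As a rough illustration of the second step, here is a minimal sketch in Python, assuming it runs server-side and that the article HTML has already been cleaned by some Readability-style library. The names `FirstImageFinder` and `first_image_url` are made up for this example, and it uses only the standard library's `html.parser` rather than any of the libraries named above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FirstImageFinder(HTMLParser):
    """Remembers the src attribute of the first <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.first_src = None

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; ignore everything after the first hit.
        if tag == "img" and self.first_src is None:
            src = dict(attrs).get("src")
            if src:
                self.first_src = src


def first_image_url(clean_html, base_url):
    """Return the absolute URL of the first image in clean_html, or None.

    clean_html is assumed to be the already-cleaned article body;
    base_url is the article's own URL, used to resolve relative paths.
    """
    finder = FirstImageFinder()
    finder.feed(clean_html)
    finder.close()
    if finder.first_src is None:
        return None
    return urljoin(base_url, finder.first_src)
```

Calling `first_image_url(cleaned_body, article_url)` gives either a thumbnail candidate or `None`, so the app can fall back to a placeholder when the article has no images.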

+1

This article discusses various methods.

A good example of thumbnail extraction is the pre-generated thumbnails on reddit. Details on how reddit identifies and displays thumbnails are here and here.

+1

Source: https://habr.com/ru/post/1443826/

