Many sites (Facebook, Google+, etc.) have a feature that builds a summary with a title, an image, and some text from a link. I tried to find libraries or recommendations on how to implement this, but my search results were not useful at all.
Such a feature is usually built with some kind of "workaround": your script opens the link and inspects the data it returns, just as you suggest yourself.
I know that I can parse HTML pages and extract the elements I want, but I think there should be some standard for how to do this (and maybe also for how to create pages that are friendly to this kind of functionality).
The standard way is how most search engines, such as Google, do it. You take the title from the page's title element and the description from the description metadata, if there is any. Most search engines now ignore description metadata and instead try to create their own summary.
This is usually done by searching for headings (h1, h2, etc.) and then paragraphs.
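The approach described above can be sketched roughly as follows. This is only a minimal illustration using Python's standard `html.parser`; the names `SummaryParser` and `summarize` are made up for this example, and the fallback order (title, then first heading; meta description, then first paragraph) is an assumption, not a standard.

```python
from html.parser import HTMLParser


class SummaryParser(HTMLParser):
    """Collects the title, meta description, first heading and first paragraph."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "heading": "", "paragraph": ""}
        self._current = None  # name of the field currently being filled, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.fields["description"] = (attrs.get("content") or "").strip()
        elif tag == "title" and not self.fields["title"].strip():
            self._current = "title"
        elif tag in self.HEADINGS and not self.fields["heading"].strip():
            self._current = "heading"
        elif tag == "p" and not self.fields["paragraph"].strip():
            self._current = "paragraph"

    def handle_endtag(self, tag):
        # Stop collecting text once the element we were filling is closed.
        if tag in self.HEADINGS | {"title", "p"}:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data


def summarize(html):
    """Return a title and description, falling back to heading and paragraph."""
    parser = SummaryParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the summary is a single clean line.
    f = {key: " ".join(value.split()) for key, value in parser.fields.items()}
    return {
        "title": f["title"] or f["heading"],
        "description": f["description"] or f["paragraph"],
    }
```

Calling `summarize(page_source)` on the text of a fetched page returns a dictionary with `title` and `description` keys. Real pages are messier than this sketch assumes, so a production version would likely need a more forgiving parser.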
And to make the site "friendly" to this kind of workaround, you build it in accordance with web standards (W3C).
Does anyone have a good link that will point me in the right direction? JavaScript or .NET is my preferred choice, but I can implement it in another language too.
The language does not really matter, as long as it can perform a basic HTTP GET.
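To illustrate how little is needed, here is a minimal sketch of such a GET in Python, using only the standard library. The function name `fetch_html`, the User-Agent string, and the UTF-8 fallback are assumptions made for this example.

```python
import urllib.request


def fetch_html(url, timeout=10.0):
    """Fetch a URL with a plain GET and return the body as text."""
    request = urllib.request.Request(
        url,
        # Hypothetical User-Agent; some servers reject requests without one.
        headers={"User-Agent": "link-preview-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Use the charset declared in the Content-Type header, else assume UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

The returned string is the raw page source, which you can then hand to whatever HTML parser you prefer.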