How to get Discogs release images?

I want to get images of Discogs releases. Can I do this without the Discogs APIs? They have no image links in their db dumps.

2 answers

To do this without the API, you need to load the release's web page and extract the image URL from the HTML source. The page is at https://www.discogs.com/release/xxxx, where xxxx is the release number. Since HTML is just text, you can then pull out the JPEG URL with ordinary string operations.
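As a rough sketch of the download step, the page can be fetched with the HTTP client built into Java 11 and later. The class name ReleasePageFetcher, the method names, and the User-Agent value are made up for illustration, and following redirects is an assumption about how the site serves release pages.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: builds the release page URL and downloads its HTML.
class ReleasePageFetcher {

    /** Builds the release page URL from a release number. */
    static String releaseUrl(long releaseId) {
        return "https://www.discogs.com/release/" + releaseId;
    }

    /** Downloads the page and returns its HTML as a string (needs network access). */
    static String fetchHtml(long releaseId) throws IOException, InterruptedException {
        // Assumption: the short URL may redirect, so redirects are followed.
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(releaseUrl(releaseId)))
                .header("User-Agent", "ExampleImageFetcher/0.1") // hypothetical value
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

Calling ReleasePageFetcher.fetchHtml(8140515L) would then return the page text to search through.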

I do not know which programming language you use, but it almost certainly has string functions such as indexOf and substring. The image URL is the value of the content attribute of the page's og:image meta tag.

So let's take an example: https://www.discogs.com/release/8140515

  • Find startPos = myHtmlStr.indexOf("og:image\" content=\""); this is the index where the marker text begins.
  • The marker is 19 characters long, so the URL begins at startPos + 19. Find endPos = myHtmlStr.indexOf(".jpg", startPos + 19);
    This is the first occurrence of .jpg after the marker.
  • Now extract the substring from the HTML text: img_URL = myHtmlStr.substring(startPos + 19, endPos + 4); the + 4 keeps the .jpg suffix, because substring excludes its end index.

  • You should end up with a string as shown below (extracted URL):
    https://img.discogs.com/_zHBK73yJ5oON197YTDXM7JoBjA=/fit-in/600x600/filters:strip_icc():format(jpeg):mode_rgb():quality(90)/discogs-images/R-8140515-1460073064-5890.jpeg.jpg

  • The process can be shortened: take the index of https://img. as startPos, find the first occurrence of .jpg after that index, and extract that range. This works because the only place https://img. appears in the HTML source is in an image URL.
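The steps above can be sketched in Java roughly as follows. This is only an illustration: the class name OgImageExtractor and the sample HTML fragment are made up, and it assumes the meta tag is written exactly as og:image" content=" and that the URL ends in .jpg.

```java
// Sketch of the indexOf/substring approach on an already-downloaded HTML string.
class OgImageExtractor {

    // Text that directly precedes the image URL in the meta tag (19 characters).
    private static final String MARKER = "og:image\" content=\"";
    private static final String SUFFIX = ".jpg";

    /** Returns the og:image URL, or null if the marker or the suffix is not found. */
    static String extractOgImage(String html) {
        int startPos = html.indexOf(MARKER);
        if (startPos < 0) {
            return null;
        }
        int urlStart = startPos + MARKER.length();
        int endPos = html.indexOf(SUFFIX, urlStart);
        if (endPos < 0) {
            return null;
        }
        // substring excludes the end index, so add the suffix length to keep ".jpg".
        return html.substring(urlStart, endPos + SUFFIX.length());
    }

    public static void main(String[] args) {
        // Hypothetical HTML fragment, not copied from a real Discogs page.
        String html = "<head><meta property=\"og:image\" "
                + "content=\"https://img.example.com/R-1.jpeg.jpg\" /></head>";
        System.out.println(extractOgImage(html));
    }
}
```

Returning null when either search fails avoids the exception that substring would throw on a page without the tag.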

Compare the page at https://www.discogs.com/release/8140515 with the extracted URL below.

R-8140515-1460073064-5890.jpeg.jpg


Here's how to do it in Java using the Jsoup library.

  • Fetch the HTML of the release page.
  • Parse the HTML, find <meta property="og:image" content=".." />, and read its content value.
    import java.io.IOException;
    import java.util.logging.Level;
    import java.util.logging.Logger;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class DiscogRelease {

        private final String url;

        public DiscogRelease(String url) {
            this.url = url;
        }

        // Returns the og:image URL of the release page, or null if it cannot be found.
        public String getImageUrl() {
            try {
                // Download and parse the release page.
                Document doc = Jsoup.connect(this.url).get();
                // Select the og:image meta tag from the document head.
                Elements metas = doc.head().select("meta[property=\"og:image\"]");
                if (!metas.isEmpty()) {
                    Element element = metas.get(0);
                    return element.attr("content");
                }
            } catch (IOException ex) {
                Logger.getLogger(DiscogRelease.class.getName()).log(Level.SEVERE, null, ex);
            }
            return null;
        }
    }

Source: https://habr.com/ru/post/1242040/

