How to lazily read a webpage in Clojure

A friend and I recently implemented link grabbing in my Clojure IRC bot. When the bot sees a link, it opens the page and grabs the title from it. The problem is that, just to grab the title, it has to download the ENTIRE page.

How can I read the page lazily, only up to the first </title>?

+4
2 answers

Use line-seq, but remember to close the underlying stream when done.
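A minimal sketch of this suggestion, assuming the page is reachable through clojure.java.io/reader and that the title closes on some line of the response. The name page-title is hypothetical, and the regex is a rough illustration rather than a real HTML parser.

```clojure
(require '[clojure.java.io :as io])

(defn page-title
  "Hypothetical helper: reads `source` line by line and stops at the first
  line containing </title>. Returns the title text, or nil if none is found."
  [source]
  ;; with-open closes the reader (and the underlying stream) when done
  (with-open [rdr (io/reader source)]
    (let [[before after] (split-with #(not (re-find #"(?i)</title>" %))
                                     (line-seq rdr))
          ;; join only the lines read so far; everything is realized
          ;; here, while the reader is still open
          head (apply str (interpose " " (concat before (take 1 after))))]
      (second (re-find #"(?is)<title[^>]*>(.*?)</title>" head)))))
```

Calling (page-title "http://example.com/") would consume lines only until the closing tag appears, although the buffered reader may already have fetched somewhat more than that from the network.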

+6

I wouldn't expect HTML to necessarily be split into lines in any reasonable way. Without looking beyond our own backyard: Compojure (or Hiccup these days, I think) doesn't bother inserting any line breaks, I believe (update: just tested Hiccup, no line breaks).

Instead, I would suggest lazy XML parsing (with clojure.contrib.lazy-xml) on top of java.io.BufferedInputStream.
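The sketch below is not lazy-xml. It swaps in a simpler technique to show the same idea of stopping early without relying on line breaks: it reads one character at a time from a buffered reader until the text ends with </title>. The names read-through and page-title-by-chars, and the 65536-character cap, are assumptions made for illustration.

```clojure
(require '[clojure.java.io :as io])

(defn read-through
  "Hypothetical helper: reads characters from rdr until the text read so far
  ends with marker (compared case-insensitively), the input ends, or `limit`
  characters have been read. Returns the text read."
  [^java.io.Reader rdr ^String marker limit]
  (let [sb     (StringBuilder.)
        marker (.toLowerCase marker)
        n      (count marker)]
    (loop []
      (let [c (.read rdr)]
        (if (neg? c)
          (str sb)                      ; end of input
          (do
            (.append sb (char c))
            (let [len (.length sb)]
              (if (or (>= len limit)
                      (and (>= len n)
                           (= marker (.toLowerCase
                                      (.substring sb (int (- len n)))))))
                (str sb)                ; marker seen, or cap reached
                (recur)))))))))

(defn page-title-by-chars
  "Returns the title text found before the first </title>, or nil."
  [source]
  (with-open [rdr (io/reader source)]
    (let [head (read-through rdr "</title>" 65536)]
      (second (re-find #"(?is)<title[^>]*>(.*?)</title>" head)))))
```

Because it never looks for a newline, this works on single-line output such as Hiccup's. A real lazy XML parser would additionally cope with comments, CDATA, and entities, which this sketch ignores.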

+6

Source: https://habr.com/ru/post/1306788/

