How to lazily read a webpage in Clojure

Question


My friend and I recently implemented link grabbing in my Clojure IRC bot. When it sees a link, it reads the page and grabs the title from it. The problem is that, just to grab the title, it has to read the ENTIRE page.

How can I read the page lazily, only up to the first </title>?

+4

clojure lazy-evaluation networking

Rayne Apr 13 '10 at 11:42


2 answers

I wouldn't expect HTML to be split into lines in any reasonable way. Without looking beyond our own backyard: Compojure (or Hiccup at the moment, I think) doesn't bother inserting line breaks, I believe (update: just tested Hiccup - no line breaks).

Instead, I would suggest lazy XML parsing (with clojure.contrib.lazy-xml) on top of a java.io.BufferedInputStream.
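To illustrate why line breaks should not be relied on, here is a minimal sketch that reads the page character by character and stops as soon as the closing tag has been seen. Note that this is a plain character scan, not the lazy XML parsing suggested above, and it assumes clojure.java.io (Clojure 1.2 or later). The names read-until, extract-title and page-title-by-chars, and the 65536-character cap, are hypothetical choices for illustration.

```clojure
(require '[clojure.java.io :as io])

(defn read-until
  "Reads characters from rdr until the text read so far ends with marker
  (compared case-insensitively), max-chars have been read, or the stream
  ends. Returns the text read."
  [^java.io.Reader rdr ^String marker max-chars]
  (let [sb (StringBuilder.)
        m  (count marker)]
    (loop []
      (let [len (.length sb)]
        (if (or (>= len max-chars)
                (and (>= len m)
                     (.equalsIgnoreCase marker (.substring sb (- len m)))))
          (str sb)
          (let [c (.read rdr)]
            (if (neg? c)
              (str sb)                       ; end of stream
              (do (.append sb (char c))
                  (recur)))))))))

(defn extract-title
  "Returns the contents of the first <title> element in html, or nil."
  [html]
  (second (re-find #"(?is)<title[^>]*>(.*?)</title>" html)))

(defn page-title-by-chars
  "Hypothetical helper: fetches only as much of url as is needed to see
  </title>, capped at 65536 characters (an arbitrary assumption)."
  [url]
  (with-open [rdr (io/reader url)]
    (extract-title (read-until rdr "</title>" 65536))))
```

The reader returned by io/reader is buffered, so somewhat more than the title may be fetched from the network, but the rest of the page is never pulled in.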

+6

Michał Marczyk Apr 13 '10 at 14:40


cgrand · Accepted Answer · 2010-04-13T12:26:12+0000

Use line-seq, but remember to close the underlying stream when done.
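A minimal sketch of what this could look like, assuming clojure.java.io (Clojure 1.2 or later) and assuming the title opens and closes on a single line. The function names title-from-lines and page-title are hypothetical.

```clojure
(require '[clojure.java.io :as io])

(defn title-from-lines
  "Walks a (possibly lazy) seq of lines and returns the contents of the
  first <title> element found, or nil. Assumes the opening and closing
  tags sit on the same line."
  [lines]
  (some #(second (re-find #"(?i)<title[^>]*>(.*?)</title>" %)) lines))

(defn page-title
  "Hypothetical helper: reads url line by line and stops at the first
  line containing a title. with-open closes the reader afterwards."
  [url]
  (with-open [rdr (io/reader url)]
    (title-from-lines (line-seq rdr))))
```

Because some stops at the first match, only the lines up to the title are consumed. The result is fully computed inside with-open, so nothing lazy escapes after the stream is closed.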
