While stress testing some Clojure code at work, I noticed that it runs out of heap space when iterating over large data sets. In the end, I was able to track the problem down to a combination of Clojure's doseq macro and the implementation of lazy sequences.
This is the smallest piece of code that causes Clojure to crash by running out of available heap space:
(doseq [e (take 1000000000 (iterate inc 1))] (identity e))
The doseq documentation clearly states that it does not retain the head of a lazy sequence, so I would expect the memory complexity of the code above to be close to O(1). Is there something I'm missing? What is the idiomatic Clojure way to iterate over very large lazy sequences if doseq is not up to the job?
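For comparison, here is a minimal sketch of the kind of constant-memory iteration I have in mind. It uses loop/recur over a plain counter, so no sequence is ever built. The helper name visit-range and the atom total are hypothetical and only there for illustration, and this does not explain why the doseq version above runs out of heap.

```clojure
;; Hypothetical helper (not part of the failing code): call f on each
;; integer from 1 to n for side effects, without building any sequence.
(defn visit-range
  [n f]
  (loop [i 1]
    (when (<= i n)
      (f i)
      (recur (inc i)))))

;; Small n for illustration: sum the integers 1..1000 as a side effect.
(def total (atom 0))
(visit-range 1000 #(swap! total + %))
@total ; => 500500
```

Since only the counter i is live on each pass, memory use should not depend on n. The drawback is that this only works when the elements can be computed from an index, which is not the case for an arbitrary lazy sequence.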