I need to read a large file (~1 GB), process it, and save it to a database. My solution looks like this:
data.txt
format: [id],[title]\n
1,Foo
2,Bar
...
code
(ns test.core
  (:require [clojure.java.io :as io]
            [clojure.string :refer [split]]))

(defn parse-line [line]
  (let [values (split line #",")]
    (zipmap [:id :title] values)))

(defn run []
  (with-open [reader (io/reader "~/data.txt")]
    ;; insert-batch just saves a vector of records into the database
    (insert-batch (map parse-line (line-seq reader)))))
But this code does not work as intended, because it parses all the rows first and then sends them to the database in a single call.
I think the ideal solution would be: read a line -> parse it -> collect 1000 parsed lines -> batch-insert them into the database -> repeat until there are no lines left. Unfortunately, I do not know how to implement this.
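One way to get that read -> parse -> batch -> insert loop is to combine `partition-all` with `doseq`: `partition-all` chunks the lazy sequence of parsed rows into groups of 1000 (the last group may be smaller), and `doseq` realizes one chunk at a time, so only about one batch of parsed lines is held in memory. A minimal sketch, with `insert-batch` stubbed out (the real database function isn't shown in the question) and the path taken as an argument, since `io/reader` does not expand `~`:

```clojure
(ns example.core
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

(defn parse-line [line]
  (zipmap [:id :title] (str/split line #",")))

;; Stub standing in for the real database call; replace with your
;; actual insert-batch that writes a batch of records to the db.
(defn insert-batch [records]
  (println "inserting" (count records) "records"))

(defn run [path]
  (with-open [reader (io/reader path)]
    (doseq [batch (partition-all 1000 (map parse-line (line-seq reader)))]
      ;; each batch is a seq of at most 1000 parsed maps
      (insert-batch batch))))
```

Because `doseq` is eager but processes the lazy sequence element by element, each batch is inserted before the next one is read, and the whole file is never resident at once.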