Maybe you can use span
and duplicate
efficiently. Assuming the iterator is positioned at the beginning of the batch, you take a span before the next batch, duplicate it so that you can count the pages, write the modified row of the batch, and then write the pages using the duplicated iterator. Then we process the next batch recursively ...
def batch(i: Iterator[String]) { if (i.hasNext) { assert(i.next() == "batch") val (current, next) = i.span(_ != "batch") val (forCounting, forWriting) = current.duplicate val count = forCounting.filter(_ == "p").size println("batch " + count) forWriting.foreach(println) batch(next) } }
Assuming the following input:
val src = Source.fromString("head\nbatch\np\np\nbatch\np\nbatch\np\np\np\n")
You position the iterator at the beginning of the batch, and then process the batch:
val (head, next) = src.getLines.span(_ != "batch") head.foreach(println) batch(next)
Fingerprints:
head batch 2 p p batch 1 p batch 3 p p p
source share