I would like to write a simple function that iterates over the lines of a text file. I believe that in 2.8 one could write:
    def lines(filename: String): Iterator[String] = {
      scala.io.Source.fromFile(filename).getLines
    }
and that was that, but in 2.9 the above does not work, and instead I have to do:
    import java.io.File  // needed for new File(...)
    def lines(filename: String): Iterator[String] = {
      scala.io.Source.fromFile(new File(filename)).getLines()
    }
Now the problem is that I want to compose the above iterators in a for comprehension:
    for ( l1 <- lines("file1.txt"); l2 <- lines("file2.txt") ) {
      do_stuff(l1, l2)
    }
This, again, used to work with 2.8, but in 2.9 it causes a "too many open files" exception to be thrown. This is understandable: the second lines call in the comprehension ends up opening (and not closing) a file for each line in the first.
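To make the leak concrete, here is a minimal sketch of how the comprehension expands into nested foreach calls, with a counter showing how often a file gets opened. The names OpenCount, opened and tempFile are hypothetical helpers invented for this illustration, and two tiny temporary files stand in for the real inputs.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

object OpenCount {
  // Hypothetical counter, only here to make the number of opens visible.
  var opened = 0

  // Same shape as lines above, plus the counter. The Source is never closed.
  def lines(filename: String): Iterator[String] = {
    opened += 1
    Source.fromFile(new File(filename)).getLines()
  }

  // Hypothetical helper: writes the given lines to a temp file, returns its path.
  def tempFile(content: Seq[String]): String = {
    val f = File.createTempFile("lines", ".txt")
    f.deleteOnExit()
    val w = new PrintWriter(f)
    try content.foreach(line => w.println(line)) finally w.close()
    f.getPath
  }

  def main(args: Array[String]): Unit = {
    val big = tempFile(Seq("a", "b", "c"))
    val small = tempFile(Seq("x", "y"))

    // for (l1 <- lines(big); l2 <- lines(small)) { ... } becomes:
    lines(big).foreach { l1 =>
      lines(small).foreach { l2 =>   // evaluated once per line of the big file
        println(l1 + " " + l2)
      }
    }

    // One open for the big file plus one per line in it: 1 + 3 = 4.
    println("files opened: " + opened)
  }
}
```

With a big file of many thousands of lines, the same pattern opens the small file that many times without ever closing it, which matches the error described above.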
In my case, I know that "file1.txt" is large, and I do not want to slurp it into memory, but the second file is small, so I can write a different linesEager like this:
    def linesEager(filename: String): Iterator[String] = {
      val buf = scala.io.Source.fromFile(new File(filename))
      val zs = buf.getLines().toList.toIterator  // read everything before closing
      buf.close()
      zs
    }
and then turn my for comprehension into:
    for (l1 <- lines("file1.txt"); l2 <- linesEager("file2.txt")) {
      do_stuff(l1, l2)
    }
This works, but it is clearly ugly. Can someone suggest a uniform and clean way of achieving the above? It seems that what is needed is a way for the iterator returned by lines to close the file when it reaches the end. Presumably this was happening in 2.8, which would explain why it worked there?
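For illustration, here is a rough sketch of what such a self-closing iterator might look like. It is not taken from the standard library: ClosingLines and tempFile are hypothetical names, and the wrapper only closes the file if the iterator is consumed to the end, so a consumer that stops early would still leak the handle.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

object ClosingLines {
  // Wraps getLines() and closes the Source the first time hasNext is false.
  def lines(filename: String): Iterator[String] = {
    val src = Source.fromFile(new File(filename))
    val underlying = src.getLines()
    new Iterator[String] {
      private var closed = false

      def hasNext: Boolean = {
        if (closed) false
        else {
          val more = underlying.hasNext
          if (!more) { src.close(); closed = true }  // end reached: release the file
          more
        }
      }

      def next(): String = {
        if (hasNext) underlying.next()
        else throw new NoSuchElementException("next on empty iterator")
      }
    }
  }

  // Hypothetical helper: writes the given lines to a temp file, returns its path.
  def tempFile(content: Seq[String]): String = {
    val f = File.createTempFile("lines", ".txt")
    f.deleteOnExit()
    val w = new PrintWriter(f)
    try content.foreach(line => w.println(line)) finally w.close()
    f.getPath
  }

  def main(args: Array[String]): Unit = {
    // Two small temp files stand in for file1.txt and file2.txt.
    val big = tempFile(Seq("a", "b", "c"))
    val small = tempFile(Seq("b", "c", "d"))
    for (l1 <- lines(big); l2 <- lines(small)) {
      if (l1 == l2) println("found " + l1)
    }
  }
}
```

Because foreach runs each inner iterator to exhaustion, the small file is closed again before the next line of the big file is read, so only a couple of handles should be open at any one time.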
Thanks!
BTW, here is a minimal version of the full program that shows the problem:
    import java.io.PrintWriter
    import java.io.File

    object Fail {

      def lines(filename: String) : Iterator[String] = {
        val f = new File(filename)
        scala.io.Source.fromFile(f).getLines()
      }

      def main(args: Array[String]) = {
        val smallFile = args(0)
        val bigFile = args(1)

        println("helloworld")

        for ( w1 <- lines(bigFile)
            ; w2 <- lines(smallFile)
            )
        {
          if (w2 == w1){
            val msg = "%s=%s\n".format(w1, w2)
            println("found" + msg)
          }
        }

        println("goodbye")
      }
    }
In 2.9.0 I compile with scalac WordsFail.scala, and then I get the following:
    rjhala@goto :$ scalac WordsFail.scala
    rjhala@goto :$ scala Fail passwd words
    helloworld
    java.io.FileNotFoundException: passwd (Too many open files)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:120)
        at scala.io.Source$.fromFile(Source.scala:91)
        at scala.io.Source$.fromFile(Source.scala:76)
        at Fail$.lines(WordsFail.scala:8)
        at Fail$$anonfun$main$1.apply(WordsFail.scala:18)
        at Fail$$anonfun$main$1.apply(WordsFail.scala:17)
        at scala.collection.Iterator$class.foreach(Iterator.scala:652)
        at scala.io.BufferedSource$BufferedLineIterator.foreach(BufferedSource.scala:30)
        at Fail$.main(WordsFail.scala:17)
        at Fail.main(WordsFail.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at scala.tools.nsc.util.ScalaClassLoader$$anonfun$run$1.apply(ScalaClassLoader.scala:78)
        at scala.tools.nsc.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:24)
        at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.asContext(ScalaClassLoader.scala:88)
        at scala.tools.nsc.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:78)
        at scala.tools.nsc.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:101)
        at scala.tools.nsc.ObjectRunner$.run(ObjectRunner.scala:33)
        at scala.tools.nsc.ObjectRunner$.runAndCatch(ObjectRunner.scala:40)
        at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:56)
        at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:80)
        at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:89)
        at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)