The fundamental lazy I/O primitive, hGetContents, produces a String lazily: it only reads from the handle as needed to produce the parts of the string that your program actually demands. Once the handle has been closed, however, it is no longer possible to read from it, and if you try to inspect a part of the string that has not yet been read, you will get this exception. For example, suppose you write:
import System.IO

main = do
  most <- withFile "myfile" ReadMode (\h -> do
    s <- hGetContents h
    let (first12,rest) = splitAt 12 s
    print first12
    return rest)
  putStrLn most
GHC opens myfile and sets it up for lazy reading into the string bound to s. It does not actually start reading from the file at this point. It then sets up a lazy computation to split the string after 12 characters. Next, print forces that computation, so GHC reads a chunk of myfile at least 12 characters long and prints the first twelve. It then closes the file when withFile completes, and tries to print the rest. If the file was longer than the chunk GHC had buffered, you will get the delayed read exception once it reaches the end of that chunk.
How to avoid this problem
You need to be sure that you have actually read everything you need from the file before closing it or returning from withFile. If the function you pass to withFile just performs some I/O and returns a constant (such as ()), you don't need to worry about this. If you need it to produce a real value from a lazy read, you must be sure to force that value sufficiently before returning. In the example above, you can force the string to "normal form" using a function or operator from the Control.DeepSeq module:
return $!! rest
This ensures that the rest of the string is actually read before withFile closes the file. The $!! approach also works well if what you return is some value computed from the file contents, as long as its type is an instance of the NFData class. In this case, and many others, it's even better to simply move the rest of the code that processes the file contents into the function passed to withFile, for example:
import System.IO

main = withFile "myfile" ReadMode (\h -> do
  s <- hGetContents h
  let (first12,rest) = splitAt 12 s
  print first12
  putStrLn rest)
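For cases where the value really does have to escape withFile, the $!! approach described earlier can be sketched as a complete program. The helper names restAfter12 and lineCount and the demo file name are assumptions made for illustration; $!! comes from Control.DeepSeq in the deepseq package.

```haskell
import Control.DeepSeq (($!!))
import System.IO

-- Hypothetical helper: prints the first 12 characters and returns the
-- rest, fully read into memory before the handle is closed.
restAfter12 :: FilePath -> IO String
restAfter12 path = withFile path ReadMode $ \h -> do
  s <- hGetContents h
  let (first12, rest) = splitAt 12 s
  print first12
  return $!! rest        -- forces rest to normal form inside withFile

-- Hypothetical helper: returns a value computed from the contents.
-- Int is an NFData instance, so $!! forces the count before returning.
lineCount :: FilePath -> IO Int
lineCount path = withFile path ReadMode $ \h -> do
  s <- hGetContents h
  return $!! length (lines s)

main :: IO ()
main = do
  let path = "deepseq-demo.txt"   -- assumed scratch file name
  writeFile path (unlines (map show [1 .. 5000 :: Int]))
  rest <- restAfter12 path
  putStrLn ("characters after the first 12: " ++ show (length rest))
  n <- lineCount path
  putStrLn ("lines: " ++ show n)
```

Note that forcing the whole string keeps all of it in memory at once, which is the price of letting the value outlive the handle.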
Another function to consider as an alternative is readFile. readFile keeps the file open until it has finished reading the file. You should only use readFile, however, if you know that you will actually demand the entire contents of the file; otherwise you could leak file descriptors.
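As a rough sketch of the readFile alternative, the following consumes the file all the way to the end, which is the situation in which readFile is appropriate. The helper name restLength and the demo file name are assumptions made for illustration.

```haskell
-- Hypothetical helper: prints the first 12 characters, then consumes
-- the remainder of the file by measuring its length.
restLength :: FilePath -> IO Int
restLength path = do
  s <- readFile path
  let (first12, rest) = splitAt 12 s
  print first12
  -- Computing the length walks to the end of the file, so the whole
  -- contents are demanded and the underlying handle can be released.
  return $! length rest

main :: IO ()
main = do
  let path = "readfile-demo.txt"   -- assumed scratch file name
  writeFile path (unlines (map show [1 .. 1000 :: Int]))
  n <- restLength path
  putStrLn ("characters after the first 12: " ++ show n)
```

If the program only ever looked at first12, the file would stay open, which is the descriptor leak mentioned above.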
History
According to the Haskell Report, once the handle is closed, the contents of the string become fixed.
In the past, GHC simply ended the string at the end of whatever was buffered at the time the handle was closed. For example, if you had inspected the first 10 characters of the string before closing the handle, and GHC had buffered an additional 634 characters without reaching the end of the file, you would get an ordinary string of 644 characters. This was a common source of confusion among new users and an occasional source of bugs in production code.
As of GHC 7.10.1, this behavior has changed. When you close a handle that you are reading from lazily, GHC now effectively puts an exception at the end of the buffer instead of the usual :"". Therefore, if you try to inspect the string beyond the point where the file was closed, you will get an exception.