Reimplementing getContents using getChar

Question

In my quest to understand lazy IO in Haskell, I tried the following:

    main = do
      chars <- getContents
      consume chars

    consume :: [Char] -> IO ()
    consume [] = return ()
    consume ('x':_) = consume []
    consume (c : rest) = do
      putChar c
      consume rest

which just echoes every character typed on stdin until I hit "x".

So, I naively thought that it should be possible to reimplement getContents using getChar with something along the following lines:

    myGetContents :: IO [Char]
    myGetContents = do
      c <- getChar
      -- And now?
      return (c: ???)

It turns out this is not so simple: for ??? a function of type IO [Char] -> [Char] is required, which, I think, would violate the whole idea of the IO monad.

Checking the implementation of getContents (or rather hGetContents) reveals a whole sausage factory of dirty IO internals. Is my assumption correct that myGetContents cannot be implemented without using dirty, i.e. monad-breaking, code?

+5

haskell lazy-io

johanneslink Dec 03 '16 at 17:53

3 answers

You really should avoid using anything in System.IO.Unsafe if at all possible. Those functions tend to break referential transparency and are not commonly used in Haskell unless absolutely necessary.

If you change your type signature, I suspect you can get a more idiomatic approach to your problem.

    consume :: Char -> Bool
    consume 'x' = False
    consume _ = True

    main :: IO ()
    main = loop
      where
        loop = do
          c <- getChar
          if consume c
            then do
              putChar c
              loop
            else return ()

+1

bojo Dec 04 '16 at 0:20

You can do this without any hacks.

If your goal is just to read all of stdin into a String, you don't need any unsafe* functions.

IO is a Monad, and every Monad is an applicative functor. A functor is defined by the fmap function, whose signature is:

    fmap :: Functor f => (a -> b) -> f a -> f b

which satisfies these two laws:

    fmap id = id
    fmap (f . g) = fmap f . fmap g

Effectively fmap applies the function to wrapped values.
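As a minimal sketch of what "applying a function to a wrapped value" means for IO, the snippet below lifts a pure function over an IO action. The names fixedLine and shout are hypothetical and only serve the illustration; fixedLine stands in for a real input action such as getLine.

```haskell
import Data.Char (toUpper)

-- Hypothetical stand-in for an input action such as getLine:
-- it always yields the same string.
fixedLine :: IO String
fixedLine = return "hello"

-- fmap lifts the pure function (map toUpper) over the IO action,
-- producing a new IO action whose result is the transformed string.
shout :: IO String
shout = fmap (map toUpper) fixedLine

main :: IO ()
main = shout >>= putStrLn  -- prints HELLO
```

Note that fmap never runs the action itself; it only describes a new action that transforms the result once it is run.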

For a specific character 'c' , what is the type of fmap ('c':) ? We can write two types down and then combine them:

    fmap        :: Functor f => (a -> b) -> f a -> f b
    ('c':)      :: [Char] -> [Char]
    fmap ('c':) :: Functor f => f [Char] -> f [Char]

Recalling that IO is a functor, if we want to define myGetContents :: IO [Char] , it seems reasonable to use this:

    myGetContents :: IO [Char]
    myGetContents = do
      x <- getChar
      fmap (x:) myGetContents

This is close, but not quite equivalent to getContents, as this version will try to read past the end of the file and throw an error instead of returning a string. That should be clear just from looking at it: there is no way for it to return a concrete list, only an endless chain of conses. Knowing that the base case is "" at EOF (and using the infix <$> syntax for fmap) leads us to:

    import System.IO

    myGetContents :: IO [Char]
    myGetContents = do
      reachedEOF <- isEOF
      if reachedEOF
        then return []
        else do
          x <- getChar
          (x:) <$> myGetContents

The Applicative class provides a (slight) simplification.

Recall that IO is an applicative functor, not just any old functor. There are Applicative laws associated with this type class, just as there are Functor laws, but we will look specifically at <*>:

    (<*>) :: Applicative f => f (a -> b) -> f a -> f b

This is almost identical to fmap (aka <$>), except that the function being applied is also wrapped. We can then avoid the bind in our else clause by using the applicative style:

    import System.IO

    myGetContents :: IO String
    myGetContents = do
      reachedEOF <- isEOF
      if reachedEOF
        then return []
        else (:) <$> getChar <*> myGetContents
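To see the applicative pattern in isolation, here is a small sketch that combines two IO actions with (:) without reading from stdin. The actions firstChar and restChars are hypothetical stand-ins for getChar and the recursive call.

```haskell
-- Hypothetical stand-in for getChar.
firstChar :: IO Char
firstChar = return 'a'

-- Hypothetical stand-in for the recursive myGetContents call.
restChars :: IO String
restChars = return "bc"

-- (:) <$> firstChar has type IO (String -> String);
-- <*> then applies that wrapped function to the result of restChars.
combined :: IO String
combined = (:) <$> firstChar <*> restChars

main :: IO ()
main = combined >>= putStrLn  -- prints abc
```

The effects still run left to right: firstChar is executed before restChars, which is what keeps the characters in input order.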

One modification is required if the input can be infinite.

Remember when I said you don't need unsafe* functions if you just want to read all of stdin into a String? Well, if you only want some of the input, you will. If your input could be infinitely long, you definitely will. The final program differs by one import and one word:

    import System.IO
    import System.IO.Unsafe

    myGetContents :: IO [Char]
    myGetContents = do
      reachedEOF <- isEOF
      if reachedEOF
        then return []
        else (:) <$> getChar <*> unsafeInterleaveIO myGetContents

The defining function of lazy IO is unsafeInterleaveIO (from System.IO.Unsafe). It delays the execution of the IO action until its result is needed.
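The deferral can be observed without stdin. The following is a sketch under stated assumptions: tick is a hypothetical stand-in for getChar that increments a counter, and lazyTicks has the same shape as the lazy myGetContents but never terminates on its own. Only as many ticks run as list elements are actually demanded.

```haskell
import Control.Exception (evaluate)
import Data.IORef (IORef, newIORef, readIORef, modifyIORef')
import System.IO.Unsafe (unsafeInterleaveIO)

-- Hypothetical stand-in for getChar: bumps a counter and returns its new value.
tick :: IORef Int -> IO Int
tick ref = do
  modifyIORef' ref (+ 1)
  readIORef ref

-- Infinite lazy stream of ticks; the recursive call is deferred
-- until the tail of the list is forced.
lazyTicks :: IORef Int -> IO [Int]
lazyTicks ref = do
  x <- tick ref
  rest <- unsafeInterleaveIO (lazyTicks ref)
  return (x : rest)

-- Take three elements, force them, then report how many ticks actually ran.
demo :: IO ([Int], Int)
demo = do
  ref <- newIORef 0
  xs <- lazyTicks ref
  let firstThree = take 3 xs
  _ <- evaluate (length firstThree)  -- forces the spine of the three-element prefix
  ticksRun <- readIORef ref
  return (firstThree, ticksRun)

main :: IO ()
main = demo >>= print  -- prints ([1,2,3],3)
```

Without unsafeInterleaveIO, lazyTicks would loop forever before returning anything, which is the same reason the strict myGetContents cannot handle infinite input.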

0

Fox Feb 17 '18 at 22:19

Reid Barton · Accepted Answer · 2016-12-03T18:16:28+0000

You need a new primitive, unsafeInterleaveIO :: IO a -> IO a, which delays the execution of its argument action until the result of that action is evaluated. Then

    myGetContents :: IO [Char]
    myGetContents = do
      c <- getChar
      rest <- unsafeInterleaveIO myGetContents
      return (c : rest)
