Error Detection and Reporting Using Maybe

Question

Error Detection and Reporting Using Maybe

I am writing a propositional logic parser in Haskell. Now I am doing parsing as a training exercise. In the end I will do the parsec. Meanwhile, I'm trying to bow my head to Monad. In particular, I use Maybe to report errors from my parse function. My current problem is related to part of the helper function:

 parse' :: String -> (Maybe Wff, String) parse' ('[':rest) = (x, if null rest'' then "" else tail rest'') where (a, rest') = parse' rest (b, rest'') = parse' (if null rest' then "" else tail rest') x = if null rest' || null rest'' || head rest' /= '|' || head rest'' /= ']' then Nothing else Or <$> a <*> b

(For reference, the full parse function can be found here .)

This code analyzes a sentence of the form [ A | B ] [ A | B ] , where A and B are any arbitrary sentences. As you can see, I use the applicative style on the last line to propagate the result of Nothing if the previous recursive call results in Nothing . This allowed me to remove a == Nothing and b == Nothing from the if condition. How can I use an Applicative or Monad Maybe instance to break up the rest of this if ?

+4

parsing haskell text-parsing

Code-apprentice Jun 12 '13 at 0:47

source share

3 answers

I really will make the problem in the opposite direction: I will start with a monadic solution and return from it to a manual solution. This will result in the same code that you get if you reach the correct solution manually.

A typical signature of the type of monadic parsing is:

 type Parser a = String -> Maybe (a, String)

Notice the slight difference with your form, where you have the final String on the outside of Maybe . Both are valid, but I prefer this form more because I believe that the remainders of the String invalid if the parsing failed.

This type is actually a special case of StateT , which is defined as:

 newtype StateT sma = StateT { runStateT :: s -> m (a, s) }

Please note that if we choose:

 s = String m = Maybe

... we return the Parser type:

 type Parser a = StateT String Maybe a -- or: type Parser = StateT String Maybe

What's cool is that we just need to define one parser manually, which is a parser that extracts one character:

 anyChar :: Parser Char anyChar = StateT $ \str -> case str of [] -> Nothing c:cs -> Just (c, cs)

Note that if we removed the StateT , anyChar would be the following:

 anyChar :: String -> Maybe (Char, String)

When we complete it in StateT , it becomes:

 anyChar :: StateT String Maybe Char

... which is just a Parser Char .

Once we get this primitive parser, we can define all the other parsers using the StateT Monad interface. For example, let's define a parser that matches a single character:

 import Control.Monad char :: Char -> Parser () char c' = do c <- anyChar guard (c == c')

It was easy! guard requires an instance of MonadPlus for our monad, but we already have one. The reason is the following two instances of MonadPlus :

 instance (MonadPlus m) => MonadPlus (StateT sm) where ... instance MonadPlus Maybe where ...

The combination of these two instances means that StateT s Maybe automatically implements MonadPlus , so we can use guard , and it is just magically “right”.

With these two parsers, your last parser becomes very easy to write:

 data Wff = Or Char Char deriving (Show) parseWff :: Parser Wff parseWff = do char '[' a <- anyChar char '|' b <- anyChar char ']' return (Or ab)

This is much clearer and clearer what is happening. It also works:

 >>> runStateT parseWff "[A|B]" Just (Or 'A' 'B',"")

Work back

This brings us to your original question: how do you write the same behavior? We will work backwards from Monad and MonadPlus to deduce what they do under the hood for us.

To do this, we need to infer that Monad and MonadPlus for StateT when its base monad Maybe . Let's start with a Monad instance for StateT :

 instance (Monad m) => Monad (StateT sm) where return r = StateT (\s -> return (r, s)) m >>= f = StateT $ \s0 -> do (a, s1) <- runStateT m s0 runStateT (fa) s1

Note that it is defined in general terms of the base monad. This means that we need a Monad instance for Maybe to get what the previous code does:

 instance Monad Maybe where return = Just m >>= f = case m of Nothing -> Nothing Just a -> fa

If we replace the monad Maybe instance with the monad StateT , we get:

 instance Monad (StateT s Maybe) where return r = StateT (\s -> Just (r, s)) m >>= f = StateT $ \s0 -> case runStateT m s0 of Nothing -> Nothing Just (a, s1) -> runStateT (fa) s1

We can do the same to get a Monad instance for StateT s Maybe . We just need to take MonadPlus instances for StateT and `Maybe:

 instance (MonadPlus m) => MonadPlus (StateT sm) where mzero = StateT (\_ -> mzero) mplus (StateT f) (StateT g) = StateT (\s -> mplus (fs) (gs)) instance MonadPlus Maybe where mzero = Nothing mplus m1 m2 = case m1 of Just a -> Just a Nothing -> case m2 of Just b -> Just b Nothing -> Nothing

... and combine them into one:

 instance MonadPlus (StateT s Maybe) where mzero = StateT (\_ -> Nothing) mplus (StateT f) (StateT g) = StateT $ \s -> case fs of Just a -> Just a Nothing -> case gs of Just b -> Just b Nothing -> Nothing

Now we can find out what our parsers do under the hood. Let's start with the char analyzer:

 char c' = do c <- anyChar guard (c == c')

This is the removal:

 char c' = anyChar >>= \c -> guard (c == c')

We simply StateT s Maybe that (>>=) for StateT s Maybe , so replace this with:

 char c' = StateT $ \str0 -> case runStateT anyChar str0 of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1

We already know the definition of anyChar , so replace this with:

 char c' = StateT $ \str0 -> case runStateT (StateT $ \str -> case str of [] -> Nothing c:cs -> Just (c, cs) ) str0 of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1

We also know that runStateT is the inverse of StateT , so:

 char c' = StateT $ \str0 -> case (\str -> case str of [] -> Nothing c:cs -> Just (c, cs) ) str0 of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1

Then we can apply the lambda to str0 :

 char c' = StateT $ \str0 -> case (case str0 of [] -> Nothing c:cs -> Just (c, cs) ) of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1

Now we extend the external case statement to the internal case statement:

 char c' = StateT $ \str0 -> case str0 of [] -> case Nothing of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1 c:cs -> case Just (c, cs) of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1

... and evaluate case statements:

 char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> runStateT ((\c -> guard (c == c')) c) cs

Then we can apply the lambda to c :

 char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> runStateT (guard (c == c')) cs

To simplify this, we need to know what the guard does. Here is the source code for it:

 guard pred = if pred then return () else mzero

We already know that return and mzero for StateT s Maybe , so replace them with:

 guard pred = if pred then StateT (\s -> Just ((), s)) else StateT (\_ -> Nothing)

Now we can embed this in our function:

 char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> runStateT (if (c == c') then StateT (\s -> Just ((), s)) else StateT (\_ -> Nothing) ) cs

If we extend runStateT over what we get:

 char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> (if (c == c') then (\s -> Just ((), s)) else (\_ -> Nothing) ) cs

Similarly, we can apply both branches to cs :

 char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> if (c == c') then Just ((), cs) else Nothing

To the equivalent code that we wrote manually, we did not use instances of Monad or MonadPlus at all.

Final parser

Now I repeat this process for the last function, but leave the output as an exercise for you:

 parseWff = do char '[' a <- anyChar char '|' b <- anyChar char ']' return (Or ab) parseWff = StateT $ \str0 -> case str0 of [] -> Nothing c1:str1 -> if (c1 == '[') then case str1 of [] -> Nothing c2:str2 -> case str2 of [] -> Nothing c3:str3 -> if (c3 == '|') then case str3 of [] -> Nothing c4:str4 -> case str4 of [] -> Nothing c5:str5 -> if (c5 == ']') then Just (Or c2 c4, str5) else Nothing else Nothing else Nothing

... but we can simplify this even further:

 parseWff = StateT $ \str0 -> case str0 of '[':c2:'|':c4:']':str5 -> Just (Or c2 c4, str5) _ -> Nothing

Please note that unlike the function you wrote, it does not use partial functions such as tail or incomplete pattern matching. Also, the code you wrote does not compile, but even if he did, it would still give the wrong behavior.

+5

Gabriel gonzalez Jun 12 '13 at 4:12

source share

Based on the answer from @TikhonJelvis, I updated all my parse function. (The parse' function from OP is in the where parse clause.) The first version uses the notation and `guard

 parse :: String -> Maybe Wff parse s = do (x, rest) <- parse' s guard $ null rest Just x where parse' ('~':rest) = do guard . not $ null rest (a, rest') <- parse' rest Just (Not a, rest') parse' ('[':rest) = do guard . not $ null rest (a, rest') <- parse' rest guard . not $ null rest' guard $ head rest' == '|' (b, rest'') <- parse' $ tail rest' guard . not $ null rest'' guard $ head rest'' == ']' Just (a `Or` b, tail rest'') parse' (c:rest) = do guard $ isLower c Just (Var c, rest) parse' [] = Nothing

Some additional experiments helped me understand that I can replace all but one using guard with direct pattern matching:

 parse :: String -> Maybe Wff parse s = do (x, "") <- parse' s Just x where parse' ('~':rest) = do (a, rest') <- parse' rest Just (Not a, rest') parse' ('[':rest) = do (a, ('|':rest')) <- parse' rest (b, (']':rest'')) <- parse' rest' Just (a `Or` b, rest'') parse' (c:rest) = do guard $ isLower c Just (Var c, rest) parse' [] = Nothing

0

Code-apprentice Jun 13 '13 at 23:34

source share

Tikhon jelvis · Accepted Answer · 2013-06-12T01:56:51+0000

You can use a function from Control.Monad called guard . This is a bit weird type:

 guard :: MonadPlus m => Bool -> m ()

MonadPlus covers all monads that have an "empty" case. For lists, this is [] ; for Maybe it's Nothing . guard takes the value boolean; if it is False , it evaluates this empty value; otherwise, it evaluates to return () . This behavior is mostly useful in do notation:

 x = do guard (not $ null rest' || null rest'' || head rest' /= '|' || head rest'' /= ']') Or <$> a <*> b

What is going on here is simple. If the condition evaluates to True , guard returns Just () , which is then ignored in favor of Or <$> a <*> b (since that means the do notation works). However, if the condition is False , guard returns Nothing , which propagates through the rest of the do notation to give you the final result, Nothing : exactly what you wanted.

To make the code more readable, I would also pull the condition into my own variable in the where block.

Error Detection and Reporting Using Maybe

Work back

Final parser

More articles: