I really will make the problem in the opposite direction: I will start with a monadic solution and return from it to a manual solution. This will result in the same code that you get if you reach the correct solution manually.
A typical signature of the type of monadic parsing is:
type Parser a = String -> Maybe (a, String)
Notice the slight difference with your form, where you have the final String on the outside of Maybe . Both are valid, but I prefer this form more because I believe that the remainders of the String invalid if the parsing failed.
This type is actually a special case of StateT , which is defined as:
newtype StateT sma = StateT { runStateT :: s -> m (a, s) }
Please note that if we choose:
s = String m = Maybe
... we return the Parser type:
type Parser a = StateT String Maybe a -- or: type Parser = StateT String Maybe
What's cool is that we just need to define one parser manually, which is a parser that extracts one character:
anyChar :: Parser Char anyChar = StateT $ \str -> case str of [] -> Nothing c:cs -> Just (c, cs)
Note that if we removed the StateT , anyChar would be the following:
anyChar :: String -> Maybe (Char, String)
When we complete it in StateT , it becomes:
anyChar :: StateT String Maybe Char
... which is just a Parser Char .
Once we get this primitive parser, we can define all the other parsers using the StateT Monad interface. For example, let's define a parser that matches a single character:
import Control.Monad char :: Char -> Parser () char c' = do c <- anyChar guard (c == c')
It was easy! guard requires an instance of MonadPlus for our monad, but we already have one. The reason is the following two instances of MonadPlus :
instance (MonadPlus m) => MonadPlus (StateT sm) where ... instance MonadPlus Maybe where ...
The combination of these two instances means that StateT s Maybe automatically implements MonadPlus , so we can use guard , and it is just magically βrightβ.
With these two parsers, your last parser becomes very easy to write:
data Wff = Or Char Char deriving (Show) parseWff :: Parser Wff parseWff = do char '[' a <- anyChar char '|' b <- anyChar char ']' return (Or ab)
This is much clearer and clearer what is happening. It also works:
>>> runStateT parseWff "[A|B]" Just (Or 'A' 'B',"")
Work back
This brings us to your original question: how do you write the same behavior? We will work backwards from Monad and MonadPlus to deduce what they do under the hood for us.
To do this, we need to infer that Monad and MonadPlus for StateT when its base monad Maybe . Let's start with a Monad instance for StateT :
instance (Monad m) => Monad (StateT sm) where return r = StateT (\s -> return (r, s)) m >>= f = StateT $ \s0 -> do (a, s1) <- runStateT m s0 runStateT (fa) s1
Note that it is defined in general terms of the base monad. This means that we need a Monad instance for Maybe to get what the previous code does:
instance Monad Maybe where return = Just m >>= f = case m of Nothing -> Nothing Just a -> fa
If we replace the monad Maybe instance with the monad StateT , we get:
instance Monad (StateT s Maybe) where return r = StateT (\s -> Just (r, s)) m >>= f = StateT $ \s0 -> case runStateT m s0 of Nothing -> Nothing Just (a, s1) -> runStateT (fa) s1
We can do the same to get a Monad instance for StateT s Maybe . We just need to take MonadPlus instances for StateT and `Maybe:
instance (MonadPlus m) => MonadPlus (StateT sm) where mzero = StateT (\_ -> mzero) mplus (StateT f) (StateT g) = StateT (\s -> mplus (fs) (gs)) instance MonadPlus Maybe where mzero = Nothing mplus m1 m2 = case m1 of Just a -> Just a Nothing -> case m2 of Just b -> Just b Nothing -> Nothing
... and combine them into one:
instance MonadPlus (StateT s Maybe) where mzero = StateT (\_ -> Nothing) mplus (StateT f) (StateT g) = StateT $ \s -> case fs of Just a -> Just a Nothing -> case gs of Just b -> Just b Nothing -> Nothing
Now we can find out what our parsers do under the hood. Let's start with the char analyzer:
char c' = do c <- anyChar guard (c == c')
This is the removal:
char c' = anyChar >>= \c -> guard (c == c')
We simply StateT s Maybe that (>>=) for StateT s Maybe , so replace this with:
char c' = StateT $ \str0 -> case runStateT anyChar str0 of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1
We already know the definition of anyChar , so replace this with:
char c' = StateT $ \str0 -> case runStateT (StateT $ \str -> case str of [] -> Nothing c:cs -> Just (c, cs) ) str0 of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1
We also know that runStateT is the inverse of StateT , so:
char c' = StateT $ \str0 -> case (\str -> case str of [] -> Nothing c:cs -> Just (c, cs) ) str0 of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1
Then we can apply the lambda to str0 :
char c' = StateT $ \str0 -> case (case str0 of [] -> Nothing c:cs -> Just (c, cs) ) of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1
Now we extend the external case statement to the internal case statement:
char c' = StateT $ \str0 -> case str0 of [] -> case Nothing of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1 c:cs -> case Just (c, cs) of Nothing -> Nothing Just (a, str1) -> runStateT ((\c -> guard (c == c')) a) str1
... and evaluate case statements:
char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> runStateT ((\c -> guard (c == c')) c) cs
Then we can apply the lambda to c :
char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> runStateT (guard (c == c')) cs
To simplify this, we need to know what the guard does. Here is the source code for it:
guard pred = if pred then return () else mzero
We already know that return and mzero for StateT s Maybe , so replace them with:
guard pred = if pred then StateT (\s -> Just ((), s)) else StateT (\_ -> Nothing)
Now we can embed this in our function:
char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> runStateT (if (c == c') then StateT (\s -> Just ((), s)) else StateT (\_ -> Nothing) ) cs
If we extend runStateT over what we get:
char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> (if (c == c') then (\s -> Just ((), s)) else (\_ -> Nothing) ) cs
Similarly, we can apply both branches to cs :
char c' = StateT $ \str0 -> case str0 of [] -> Nothing c:cs -> if (c == c') then Just ((), cs) else Nothing
To the equivalent code that we wrote manually, we did not use instances of Monad or MonadPlus at all.
Final parser
Now I repeat this process for the last function, but leave the output as an exercise for you:
parseWff = do char '[' a <- anyChar char '|' b <- anyChar char ']' return (Or ab) parseWff = StateT $ \str0 -> case str0 of [] -> Nothing c1:str1 -> if (c1 == '[') then case str1 of [] -> Nothing c2:str2 -> case str2 of [] -> Nothing c3:str3 -> if (c3 == '|') then case str3 of [] -> Nothing c4:str4 -> case str4 of [] -> Nothing c5:str5 -> if (c5 == ']') then Just (Or c2 c4, str5) else Nothing else Nothing else Nothing
... but we can simplify this even further:
parseWff = StateT $ \str0 -> case str0 of '[':c2:'|':c4:']':str5 -> Just (Or c2 c4, str5) _ -> Nothing
Please note that unlike the function you wrote, it does not use partial functions such as tail or incomplete pattern matching. Also, the code you wrote does not compile, but even if he did, it would still give the wrong behavior.