Difference between munch and many (satisfy p)?

In the module Text.ParserCombinators.ReadP , munch (and munch1 ), the documentation reads:

Parses the first zero or more characters matching a predicate. Always successful, exactly once, consuming all the characters. Therefore, NOT the same as (many (satisfy p)) .

How do they differ?

+6
source share
1 answer

First, find a counter example and let us use the Haskell tools to do this automatically. The QuickCheck library can give us such a counterexample very quickly:

 import Data.Function (on) import Text.ParserCombinators.ReadP import Test.QuickCheck prop_ParseTest :: String -> Property prop_ParseTest input = on (===) runParser (munch pred) (many (satisfy pred)) where runParser :: ReadP a -> [(a, String)] runParser = flip readP_to_S input -- run a given parser on a given input pred :: Char -> Bool pred = (> 'A') main :: IO () main = quickCheck prop_ParseTest 

We ask him to check whether the two parsers much pred and many (satisfy pred) . QuickCheck immediately discovers that they are different, and tries to create the shortest counter example possible:

 *** Failed! Falsifiable (after 5 tests and 2 shrinks): [("a","")] /= [("","a"),("a","")] 

So, munch pred always consumes 'a' unconditionally, and many (satisfy pred) gives a non-deterministic answer - it may or should not consume 'a' .

For example, consider starting the following two parsers in the string "abcd" :

 test1 = munch (> 'A') >> string "cd" test2 = many (satisfy (> 'A')) >> string "cd" 

The first crashes because munch consumes the entire string, and then it is not possible to match "cd" . The second is successful because many (satisfy ...) creates all possible branches

 [("","abcd"),("a","bcd"),("ab","cd"),("abc","d"),("abcd","")] 

and string "cd" is executed on a branch that consumes "ab" .

+7
source

Source: https://habr.com/ru/post/976771/


All Articles