Unplanned greedy behavior in uu-parsinglib

Problem

Today I ran into a problem, and I do not know how to solve it. This is very strange for me, because the code I wrote should (according to my current knowledge) be correct.

So, below you can find parser combinators. The most important of these is pOperator , which builds the AST operator in a very simple way (for demonstration purposes only). It consumes an "x" and can consume several "x" separated by spaces.

I have a pParens combinator that is defined as:

 pPacked pParenL (pWSpaces *> pParenR) 

therefore, it consumes Whitespaces before closing the brackets.

I / O Example

Correct input / output MUST be:

 in: "(x)" out: Single "x" in: "(x )" out: Single "x" 

but I get:

 in: "(x)" out: Single "x" in: "(x )" out: Multi (Single "x") (Single "x") -- Correcting steps: -- Inserted 'x' at position LineColPos 0 3 3 expecting one of ['\t', ' ', 'x'] 

but in the second example I get an error - and the parser behaves the same as greedy, eats some tokens (and there is no greedy operation).

I would be grateful for any help with this.

Code example

 import Prelude hiding(lex) import Data.Char hiding (Space) import qualified Text.ParserCombinators.UU as UU import Text.ParserCombinators.UU hiding(parse) import qualified Text.ParserCombinators.UU.Utils as Utils import Text.ParserCombinators.UU.BasicInstances hiding (Parser) data El = Multi El El | Single String deriving (Show) ---------- Example core grammar ---------- pElement = Single <$> pSyms "x" pOperator = applyAll <$> pElement <*> pMany (flip <$> (Multi <$ pWSpaces1) <*> pElement) ---------- Basic combinators ---------- applyAll x (f:fs) = applyAll (fx) fs applyAll x [] = x pSpace = pSym ' ' pTab = pSym '\t' pWSpace = pSpace <|> pTab pWSpaces = pMany pWSpace pWSpaces1 = pMany1 pWSpace pMany1 p = (:) <$> p <*> pMany p pSyms [] = pReturn [] pSyms (x : xs) = (:) <$> pSym x <*> pSyms xs pParenL = Utils.lexeme $ pSym '(' pParenR = Utils.lexeme $ pSym ')' pParens = pPacked pParenL (pWSpaces *> pParenR) ---------- Program ---------- pProgram = pParens pOperator -- if you replace it with following line, it works: -- pProgram = pParens pElement -- so it seems like something in pOperator is greedy tests = [ ("test", "(x)") , ("test", "(x )") ] ---------- Helpers ---------- type Parser a = P (Str Char String LineColPos) a parse ps = UU.parse ( (,) <$> p <*> pEnd) (createStr (LineColPos 0 0 0) s) main :: IO () main = do mapM_ (\(desc, p) -> putStrLn ("\n=== " ++ desc ++ " ===") >> run pProgram p) tests return () run :: Show t => Parser t -> String -> IO () run p inp = do let (a, errors) = parse p inp putStrLn ("-- Result: \n" ++ show a) if null errors then return () else do putStr ("-- Correcting steps: \n") show_errors errors putStrLn "-- " where show_errors :: (Show a) => [a] -> IO () show_errors = sequence_ . (map (putStrLn . show)) 

IMPORTANT

 pOperator = applyAll <$> pElement <*> pMany (flip <$> (Multi <$ pWSpaces1) <*> pElement) 

is equivalent to:

 foldr pChainl pElement (Multi <$ pWSpaces1) 

according to: Combinatorial analysis: short tutorial

And used to determine the priority of the operator.

+6
source share
1 answer

PMany's definition says:

 pMany :: IsParser p => pa -> p [a] pMany p = pList p 

and it offers a solution. When we see space, we do not have to jump immediately to the selection to continue with more x-es, so we define:

 pMany :: IsParser p => pa -> p [a] pMany_ng p = pList_ng p 

Of course, you can also call pList_ng right away. It would be even better to write:

 pParens (pChainr_ng (pMulti <$ pWSpaces1) px) -- 

I have not tested it since I am not sure that there should be at least one space between x-es, etc.

Doaitse

+1
source

Source: https://habr.com/ru/post/951946/


All Articles