In a nutshell, although you have a list of tokens, you still need a parser.
Parsec may handle alternative token streams, but you may have to refer to the manual - the PDF file available on Daan Leijen's "legacy" homepage is http://legacy.cs.uu.nl/daan/download/parsec/parsec. pdf . You can scroll your own parser without using the combinator library, but you will repurpose the Parsec part. As far as I remember, UU_parsing expects to work with a separate scanner, so its another option.
Although it does not handle parsing, you can find Lennart Augsson's “Lambda calculus, cooked in four ways”, useful for other things - http://www.augustsson.net/Darcs/Lambda/top.pdf
Change - here is a partially developed plan of how you can do this with Parsec, for more information you will need to refer to section 2.11 of the manual.
Suppose you have this data type for specific "internal" tokens:
data InternalTok = Ident String | BinOpPlus | BinOpMinus | UnaryOpNegate | IntLiteral Int deriving (Show)
You will then get these types for the Parsec token and do the parsing:
type MyToken = Token InternalTok type MyParser a = GenParser MyToken () a
Define the helper function according to the Parsec manual - this is the show and pos handle, so the individual definitions are shorter than cf. mytoken on page 19.
mytoken :: (MyToken -> Maybe a) -> MyParser a mytoken test = token showToken posToken testToken where showToken tok = show tok posToken tok = no_pos testToken tok = test tok
At the moment, your token does not track the initial position, therefore:
no_pos :: SourcePos no_pos = newPos "" 0 0 0
For each terminal, you must define a token function:
identifier :: MyParser MyToken identifier = mytoken (\tok -> case tok of a@ (Prefix (Ident _)) -> Just a _ -> Nothing) intLiteral :: MyParser MyToken intLiteral = mytoken (\tok -> case tok of a@ (Prefix (IntLiteral _)) -> Just a _ -> Nothing) binPlus :: MyParser MyToken binPlus = mytoken (\tok -> case tok of a@ (Infix BinOpPlus _ _) -> Just a _ -> Nothing) binMinus :: MyParser MyToken binMinus = mytoken (\tok -> case tok of a@ (Infix BinOpMinus _ _) -> Just a _ -> Nothing) unaryNegate :: MyParser MyToken unaryNegate = mytoken (\tok -> case tok of a@ (Prefix UnaryNegate _ _) -> Just a _ -> Nothing)
Edit - to process custom infix operators, you will need these paired parsers:
tokInfixL :: Int -> MyParser MyToken tokInfixL n = mytoken $ \tok -> case tok of a@ (Infix _ i InfixL) | i == n -> Just a _ -> Nothing) tokInfixR :: Int -> MyParser MyToken tokInfixR n = mytoken $ \tok -> case tok of a@ (Infix _ i InfixR) | i == n -> Just a _ -> Nothing) tokInfixC :: Int -> MyParser MyToken tokInfixC n = mytoken $ \tok -> case tok of a@ (Infix _ i InfixC) | i == n -> Just a _ -> Nothing) tokPrefix :: MyParser MyToken tokPrefix = mytoken (\tok -> case tok of a@ (Prefix _) -> Just a _ -> Nothing)
Now you can define a parser - you need to set the number of priority levels in advance, there is no way to get around this fact, since you need to encode a parser for each level.
Parsing top-level syntax just calls the highest priority parser
pExpression :: Parser Expersion pExpression = expression10
For each level of preliminary analysis, you need a parser like this, you will have to work independently without participation. You may also need to do some work on chainl / chainr - I only wrote a parser in this style with UU_Parsing, this may be slightly different for Parsec. Note. It is used, as a rule, at the top priority level.
expression10 :: Parser Expression expression10 = Apply <$> identifier <*> pExpression <|> Prefix <$> tokPrefix <*> pExpression <|> chainl (Infix <$> tokInfixL 10) expression9 <|> chainr (Infix <$> tokInfixR 10) expression9 expression9 :: Parser Expression expression9 = Prefix <$> tokPrefix <*> pExpression <|> chainl (Infix <$> tokInfixL 9) expression8 <|> chainr (Infix <$> tokInfixR 9) expression8 ...
You need to expand your syntax to handle IntLiterals and identifiers that are at level 0 in priority:
expression0 :: Parser Expression expression0 = IntLit <$> intLiteral <|> Ident <$> identifier <|> ...
Change - for unlimited priority - perhaps if you only have the application and Atom, something like this might work. Note that you will have to modify the tokInfixL and tokInfixR partisans so that they no longer correspond to level-dependencies, and you may have to experiment with the order of alternatives.
expression :: Parser Expression expression = Apply <$> identifier <*> expression <|> Prefix <$> tokPrefix <*> expression <|> chainl (Infix <$> tokInfixL) expression <|> chainr (Infix <$> tokInfixR) expression <|> intLiteral <|> identifier intLiteral :: Parser Expression intLiteral = Atom . convert <$> intLiteral where convert = ?? identifier :: Parser Expression identifier = Atom . convert <$> intLiteral where convert = ??