I don't have a complete solution to the problem, but some intuition may make applicative parsers easier to write. In applicative style, there are two kinds of "sequence" to consider:
- The sequence of parsing operations: this determines the order in which the parsers must be written.
- The sequence of underlying values: this one is more flexible, since you can combine the values in any order you like.
When the two sequences line up, the result is a very compact representation of the parser in applicative notation. For instance:
data Infix = Infix Double Operator Double

-- `infix` is a reserved word in Haskell, so the parser needs a different name
infixExpr = Infix <$> number <*> operator <*> number
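To make this concrete, here is a minimal self-contained sketch of the same shape. The question does not name a parsing library, so this assumes `Text.ParserCombinators.ReadP` from `base`; the `Operator` constructors, the `lexeme` helper, the simplified `number`, and `parseInfix` are all hypothetical names chosen for illustration. The parser is called `infixExpr` because `infix` is a reserved word.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical operator type; the original only mentions `Operator`.
data Operator = Plus | Minus | Times | Divide
  deriving (Show, Eq)

data Infix = Infix Double Operator Double
  deriving (Show, Eq)

-- Run a parser, then skip any whitespace that follows it.
lexeme :: ReadP a -> ReadP a
lexeme p = p <* skipSpaces

-- Simplified number: digits with an optional fractional part.
number :: ReadP Double
number = lexeme (read <$> decimalText)
  where
    decimalText = (++) <$> munch1 isDigit <*> option "" fraction
    fraction    = (:) <$> char '.' <*> munch1 isDigit

operator :: ReadP Operator
operator = lexeme $ choice
  [ Plus   <$ char '+'
  , Minus  <$ char '-'
  , Times  <$ char '*'
  , Divide <$ char '/'
  ]

-- Parser order and constructor argument order coincide,
-- so no glue function is needed.
infixExpr :: ReadP Infix
infixExpr = Infix <$> number <*> operator <*> number

-- Accept only inputs that parse completely and unambiguously.
parseInfix :: String -> Maybe Infix
parseInfix s = case readP_to_S (skipSpaces *> infixExpr <* eof) s of
  [(x, "")] -> Just x
  _         -> Nothing
```

With these definitions, `parseInfix "1.5 + 2"` gives `Just (Infix 1.5 Plus 2.0)`, and an incomplete input such as `"3 *"` gives `Nothing`.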
The problem is that when the two sequences do not match exactly, you have to massage the values to make things fit (you cannot change the order of the parsers):
number = f <$> sign <*> decimal <*> exponent
  where f sign decimal exponent = sign * decimal * 10 ^^ exponent
Here, computing the number takes a somewhat nontrivial combination of the parsed values, which the local function f performs.
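A runnable version of this idea might look like the sketch below. It again assumes `ReadP` from `base` rather than whatever library the question uses, and the definitions of `sign`, `decimal`, `integer`, and `parseNumber` are my own guesses at reasonable component parsers. One practical detail: `exponent` is also a `RealFloat` method exported by the Prelude, so the sketch hides it to avoid an ambiguous-name error.

```haskell
import Prelude hiding (exponent)
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Sign as a multiplier: -1 after a leading '-', otherwise 1.
sign :: ReadP Double
sign = option 1 ((-1) <$ char '-')

digits :: ReadP String
digits = munch1 isDigit

-- Unsigned decimal such as "12" or "12.5".
decimal :: ReadP Double
decimal = read <$> ((++) <$> digits <*> option "" fraction)
  where
    fraction = (:) <$> char '.' <*> digits

-- Integer with an optional leading '-'.
integer :: ReadP Int
integer = option id (negate <$ char '-') <*> (read <$> digits)

-- Optional exponent; defaults to 0 when there is no "e" part.
exponent :: ReadP Int
exponent = option 0 (satisfy (`elem` "eE") *> integer)

-- The parsers run in source order (sign, decimal, exponent);
-- the local function f decides how the three values are combined.
number :: ReadP Double
number = f <$> sign <*> decimal <*> exponent
  where
    f s d e = s * d * 10 ^^ e

-- Accept only inputs that parse completely and unambiguously.
parseNumber :: String -> Maybe Double
parseNumber s = case readP_to_S (number <* eof) s of
  [(x, "")] -> Just x
  _         -> Nothing
```

Here `parseNumber "-1.5e2"` gives `Just (-150.0)`. The point is that `f` is the only place that knows how the values fit together, while the parser order stays fixed by the input format.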
Another typical situation is that you need to drop some value:
exponent = oneOf "eE" *> integer
Here *> discards the value on the left and keeps the value on the right. The <* operator does the opposite: it discards the right and keeps the left. Both are declared infixl 4, so when you have a chain of such operators, you read it left-associatively:
p1 *> p2 <* p3 *> p4 <* p5 ≡ (((p1 *> p2) <* p3) *> p4) <* p5
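As a quick check of that equivalence, the sketch below substitutes single-character parsers for `p1` through `p5` (assuming `ReadP` from `base`; the names `chain` and `chainExplicit` are made up for this example). Both versions consume the same input and return the value of the fourth parser, which is not obvious at a glance.

```haskell
import Text.ParserCombinators.ReadP

-- The chain as written, relying on left associativity.
chain :: ReadP Char
chain = char 'a' *> char 'b' <* char 'c' *> char 'd' <* char 'e'

-- The same chain with the implicit parentheses spelled out.
chainExplicit :: ReadP Char
chainExplicit =
  (((char 'a' *> char 'b') <* char 'c') *> char 'd') <* char 'e'
```

Running either one with `readP_to_S` on `"abcde"` yields `[('d', "")]`: every parser consumes its character, but only the result of the `'d'` parser survives.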
This example is contrived: you would not want to write such a chain at all. It is better to break the expression into meaningful parts (preferably with meaningful names). One common pattern you will see:
-- discard the result of everything except `p3`
p1 *> p2 *> p3 <* p4 <* p5
However, there is a small caveat: if you want to apply something else to p3, or if p3 consists of several parts, you will have to use parentheses:
-- applying a pure function
f <$> (p1 *> p2 *> p3 <* p4 <* p5) ≡ p1 *> p2 *> (f <$> p3) <* p4 <* p5

-- p3 consists of multiple parts
p1 *> p2 *> (p3' <*> p3'') <* p4 <* p5
Again, in these situations, it is often better to simply split the expression into meaningful fragments with names.
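For illustration, here is a sketch that parses a pair of integers such as `(3, 4)`, first as a single expression and then split into named pieces. It assumes `ReadP` from `base`, and `lexeme`, `symbol`, `int`, `parens`, `pairBody`, and `pair` are hypothetical names invented for this example.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Run a parser, then skip any whitespace that follows it.
lexeme :: ReadP a -> ReadP a
lexeme p = p <* skipSpaces

symbol :: Char -> ReadP Char
symbol = lexeme . char

int :: ReadP Int
int = lexeme (read <$> munch1 isDigit)

-- Everything in one expression: the parentheses around the middle
-- part are required because it consists of several parsers.
pairInline :: ReadP (Int, Int)
pairInline =
  symbol '(' *> ((,) <$> int <* symbol ',' <*> int) <* symbol ')'

-- The same parser split into named fragments.
parens :: ReadP a -> ReadP a
parens p = symbol '(' *> p <* symbol ')'

pairBody :: ReadP (Int, Int)
pairBody = (,) <$> int <* symbol ',' <*> int

pair :: ReadP (Int, Int)
pair = parens pairBody
```

Both versions accept the same input, but in the second one each `*>` / `<*` chain is short enough to read without mentally inserting parentheses, and `parens` can be reused elsewhere.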
Applicative notation in a sense forces you to divide parsers into logical fragments so that they are easier to read, unlike monadic notation, where you could do everything in one monolithic block.
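To show the contrast, here is a small sketch of one parser written both ways, for input like `width=80`. It assumes `ReadP` from `base` (which is both an `Applicative` and a `Monad`); `settingDo`, `setting`, `key`, and `intValue` are hypothetical names for this example only.

```haskell
import Data.Char (isAlpha, isDigit)
import Text.ParserCombinators.ReadP

-- Monadic style: one block, every intermediate result bound by hand,
-- including the one that is thrown away.
settingDo :: ReadP (String, Int)
settingDo = do
  name  <- munch1 isAlpha
  _     <- char '='
  value <- read <$> munch1 isDigit
  return (name, value)

-- Applicative style: the pieces become small named parsers, and the
-- combining expression mirrors the layout of the input.
key :: ReadP String
key = munch1 isAlpha

intValue :: ReadP Int
intValue = read <$> munch1 isDigit

setting :: ReadP (String, Int)
setting = (,) <$> key <* char '=' <*> intValue
```

Both parsers produce `("width", 80)` for `"width=80"`. The `do` block scales to arbitrary length without complaint, whereas the applicative version gets hard to read quickly unless you keep extracting named pieces.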