word matches the empty string because, in
word : ESPACE* TEXT ESPACE* ;
TEXT can match the empty string, which causes
rawText : word+ ;
to loop forever.
Change
TEXT : ~(' '|'['|']')*;
to
TEXT : ~(' '|'['|']')+;
With that change your grammar is merely ambiguous instead of looping forever.
The way to think about this is that rawText can match the empty string in many ways:
- Zero TEXT tokens.
- One TEXT token with a length of 0.
- Two TEXT tokens with a length of 0.
- Three TEXT tokens with a length of 0.
- ...
This shows up when you have a syntax error ([i]), because the parser tries each of these alternatives to see whether any of them resolves the error.
To get rid of any quadratic behavior, you must make the grammar completely unambiguous:
rawText : ign (word (ign word)*)? ign ;
ign : ESPACE* ;
word : TEXT ;
The problem with the naive fix is that rawText can match "foo" in several ways:
TEXT("foo")
TEXT("fo"), ESPACE(""), TEXT("o")
TEXT("f"), ESPACE(""), TEXT("oo")
TEXT("f"), ESPACE(""), TEXT("o"), ESPACE(""), TEXT("o")
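To see how quickly the alternatives listed above pile up, here is a rough Python sketch that enumerates every way to cut a word into consecutive non-empty TEXT pieces. The function name token_splits is made up for this illustration, and the code only counts the splits; it does not model how ANTLR actually lexes the input or recovers from errors. It also leaves out the empty ESPACE("") separators shown in the list, and it assumes a non-empty word.

```python
from itertools import combinations

def token_splits(text):
    """Return every way to cut `text` into consecutive non-empty pieces.

    Hypothetical helper for illustration only; assumes `text` is non-empty.
    """
    n = len(text)
    splits = []
    # Pick any subset of the n-1 interior positions as cut points.
    for k in range(n):
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            splits.append([text[a:b] for a, b in zip(bounds, bounds[1:])])
    return splits

for pieces in token_splits("foo"):
    print(", ".join(f'TEXT("{p}")' for p in pieces))
```

For "foo" this gives the same four decompositions as the list. A word of n characters has 2^(n-1) such splits, so each extra character doubles the number of alternatives the parser may have to consider.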