How does reverse search affect the language recognized by the parser?

A non-school issue, but it appears in the Dragon book (Compiled: Principles, Methods and Tools) in the exercise:

Grammar:

S :: = aSa | aa

generates all strings of length length except an empty string.

a) Build a recursive descent parser with a countdown for this grammar that tries an alternative aSa before aa. Show that the procedure for S succeeds at 2, 4, or 8 a, but fails at 6 a. b) What language does your parser recognize?

I'm at a dead end. It seems that if 4 a is recognized as S, and two a between S is recognized, then 6 a should be recognized. I tried to implement a parser in C, but this one recognizes all the even numbers a. This is not an inability to recognize 6 a. What does this exercise mean?

/* AC implementation of Exercise 4.13 in the Dragon Book */ /* The grammar: S ::= aSa | aa */ /* Construct a recursive-descent parser with backtracking for this grammar that tries the alternative aSa before aa. Show that the procedure for S succeeds on 2, 4, or 8 a's, but fails on 6 a's. */ #include <string.h> #include <stdio.h> int S(const char *str, int start, int end); int aSa(const char *str, int start, int end); int aa(const char *str, int start, int end); /* returns 1 if a match, 0 otherwise */ int S(const char *str, int start, int end) { if(aSa(str, start, end)) return 1; else if(aa(str, start, end)) return 1; return 0; } /* returns 1 if a match, 0 otherwise */ int aSa(const char *str, int start, int end) { int len = end - start; if (len < 3) return 0; if(str[0] != 'a') return 0; if (!S(str, start+1, end-1)) return 0; if(str[len-1] != 'a') return 0; return 1; } /* returns 1 if a match, 0 otherwise */ int aa(const char *str, int start, int end) { int len = end - start; if(len != 2) return 0; if(str[0] == str[1] && str[0] == 'a') return 1; return 0; } int main() { char str[20]; printf("Enter a string: \n"); scanf("%s", str); int match = S(str, 0, strlen(str)); if(match) printf("The string %s matches\n", str); else printf("The string %s does not match\n", str); return 0; } 
+6
source share
4 answers

Even with reverse tracking, which requires the ability to rewind the input stream, the recursive descent parser is not allowed to look forward to the end of the input and is not allowed to delete characters from both ends of the stream.

The parser from left to right should be able to work with the input stream, which has only one method:

 get() : consume and read one symbol, or return an EOF symbol. 

The backtracking version needs a stream in two ways:

 posn = tell() : return an opaque value which can be used in seek() seek(posn) : reposition the stream to a previous position returned by tell() 
+2
source

The problem is not that it is a recursive descent parser; the problem is that the described implementation does not correctly consider the external context of the analysis of recursive descent. This is similar to the difference between an LL parser (SLL) and an LL parser.

The shortest entry for which weird behavior is aaaaaa is aaaaaa .

  • We start with rule S and match 1 st a .
  • We call S
    • We match 2 nd a .
    • We call S I omit the specific steps, but the key is the call to S corresponds to aaaa , which is 3 rd a through the end of the input. (See note below).
    • We try to match a , but since the end of the input has already been reached, we return and match only 2 nd through 3 rd aa .
  • We match 4 th a .

An additional note about the S inner call that corresponds to aaaa : if we knew to reserve a at the end of the input for step 3, then the S inner call could match aa instead of aaaa , which would lead to a successful analysis of the full input aaaaaa . ANTLR 4 provides this “full context” syntax ability in the recursive descent parser and is the first LL with recursive descent parser to correctly match aa instead of aaaa for this nested S call.

The SLL parser corresponds to this 2k grammar. A corresponding LL analyzer (such as ANTLR 4) corresponds to 2k for this grammar.

+4
source

I'm not going to write this in c for fun, but here the parser is written in python, as simple as I can do it (I hope that it becomes clear as pseudo-code even if you don't know this language):

 class Backtrack(Exception): pass def asa(input): if input[0:1] == 'a': parsed, remaining = s(input[1:]) if remaining[0:1] == 'a': return 'a' + parsed + 'a', remaining[1:] raise Backtrack def aa(input): if input[0:2] == 'aa': return 'aa', input[2:] raise Backtrack def s(input): try: return asa(input) except Backtrack: return aa(input) for i in range(17): print(i, ': ', end='') try: print(s('a' * i)) except Backtrack: print('failed') 

and the results as length: (parsed, remaining) :

 0 : failed 1 : failed 2 : ('aa', '') 3 : ('aa', 'a') 4 : ('aaaa', '') 5 : ('aa', 'aaa') 6 : ('aaaa', 'aa') 7 : ('aaaaaa', 'a') 8 : ('aaaaaaaa', '') 9 : ('aa', 'aaaaaaa') 10 : ('aaaa', 'aaaaaa') 11 : ('aaaaaa', 'aaaaa') 12 : ('aaaaaaaa', 'aaaa') 13 : ('aaaaaaaaaa', 'aaa') 14 : ('aaaaaaaaaaaa', 'aa') 15 : ('aaaaaaaaaaaaaa', 'a') 16 : ('aaaaaaaaaaaaaaaa', '') 

which, I suspect, will help you figure it out. the short answer is that recursive descent is a very specific, limited thing. This is not a complete search.

(this is actually a good question. makes an important point. good book.)

+1
source

Analysis procedure aa

Aaaa analysis procedure

Aaaaaa analysis procedure

Aaaaaaaa analysis procedure

The recursive parser is reset only when errors occur. He missed a situation where success was "temporary."

0
source

Source: https://habr.com/ru/post/948651/


All Articles