The standard answer to the question of how to build parsers (that construct ASTs) is to read the standard compiler texts. Aho and Ullman's Dragon book is the classic. If you don't have the patience to work through the best reference materials, you will have more trouble, because they provide the theory and explore the subtleties. But here is my answer for people in a hurry: build recursive descent parsers.
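For the hurried reader, a recursive descent parser really can fit in a few dozen lines. The grammar, token set, and tuple-based AST shape below are my own illustrative choices, not anything prescribed by the texts:

```python
# A minimal recursive descent parser for arithmetic expressions, building an
# AST of nested tuples. Illustrative grammar:
#   expr   : term (('+'|'-') term)*
#   term   : factor (('*'|'/') factor)*
#   factor : NUMBER | '(' expr ')'
import re

def tokenize(src):
    return re.findall(r"\d+|[+\-*/()]", src)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, expected=None):
        cur = self.peek()
        if cur is None or (expected is not None and cur != expected):
            raise SyntaxError(f"expected {expected!r}, got {cur!r}")
        self.pos += 1
        return cur

    def expr(self):                      # expr : term (('+'|'-') term)*
        node = self.term()
        while self.peek() in ('+', '-'):
            op = self.eat()
            node = (op, node, self.term())
        return node

    def term(self):                      # term : factor (('*'|'/') factor)*
        node = self.factor()
        while self.peek() in ('*', '/'):
            op = self.eat()
            node = (op, node, self.factor())
        return node

    def factor(self):                    # factor : NUMBER | '(' expr ')'
        if self.peek() == '(':
            self.eat('(')
            node = self.expr()
            self.eat(')')
            return node
        return ('num', int(self.eat()))

print(Parser(tokenize("1+2*3")).expr())
# ('+', ('num', 1), ('*', ('num', 2), ('num', 3)))
```

Note how each grammar rule becomes one method; precedence falls out of which rule calls which, which is exactly why this style is so quick to hand-write.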
You can build parsers with built-in error recovery. There are many papers on this subject, which was a hot topic in the 1980s; check Google Scholar for "syntax error repair." The main idea is that the parser, on encountering a parse error, skips ahead to some well-known beacon (";", the statement separator, is quite popular for C-like languages, which is why you were asked in a comment whether your language has statement terminators), or proposes various deletions from or insertions into the input stream to get past the point of the syntax error. The sheer variety of such schemes is astonishing. The key idea is to take as much error-context information into account as possible. One of the most intriguing ideas I have seen used two parsers: one ran N tokens ahead of the other, looking for syntax-error land mines, and the second parser was fed error-repair advice based on those N tokens before it ever reached the syntax error. This lets the second parser choose how to act before it arrives at the error point. Without this, most parsers throw away the left context and thereby lose the ability to repair. (I have never implemented such a scheme.)
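As a rough sketch of the "skip to a beacon" idea (often called panic-mode recovery), here is a toy statement parser. The statement form NAME '=' NUMBER ';' is invented purely for illustration; the point is only the resynchronization loop at the bottom:

```python
# Panic-mode recovery sketch: on a parse error, record it, then skip forward
# to just past the next ';' beacon and resume parsing statements.
# Hypothetical token model: each statement is NAME '=' NUMBER ';'.
def parse_statements(tokens):
    stmts, errors, i = [], [], 0
    while i < len(tokens):
        try:
            name, eq, num, semi = tokens[i:i+4]   # ValueError if input runs out
            if not name.isidentifier() or eq != '=' or not num.isdigit() or semi != ';':
                raise SyntaxError(f"bad statement near token {i}")
            stmts.append((name, int(num)))
            i += 4
        except (SyntaxError, ValueError) as err:
            errors.append(str(err))
            # Panic mode: discard tokens until the ';' beacon, then step past it.
            while i < len(tokens) and tokens[i] != ';':
                i += 1
            i += 1
    return stmts, errors

# One malformed statement in the middle; parsing continues after its ';'.
stmts, errors = parse_statements(['x', '=', '1', ';', 'y', '=', '=', ';', 'z', '=', '2', ';'])
print(stmts)   # [('x', 1), ('z', 2)]
print(errors)  # one recorded error for the 'y' statement
```

This recovers cheaply but crudely: everything between the error and the beacon is simply lost, which is precisely what the fancier insertion/deletion repair schemes try to improve on.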
The choice of what to insert can often be derived from the information used to build the parser (often FIRST and FOLLOW sets). This is relatively easy to do with L(AL)R parsers, because the parse tables contain the necessary information and are available to the parser at the point where it encounters the error. If you want to understand how to do this, you need to understand the theory (yes, back to the compiler book) of how parsers are constructed. (I have implemented this scheme successfully several times.)
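To make the FIRST-set idea concrete, here is a sketch of the standard fixed-point computation over a tiny grammar of my own invention (FOLLOW sets are computed by a similar iteration); at an error point, the terminals in the relevant FIRST set are exactly the candidates worth proposing as insertions:

```python
# Sketch of computing FIRST sets by iterating to a fixed point.
# Illustrative grammar; '' stands for epsilon (the empty production).
GRAMMAR = {                              # nonterminal -> list of productions
    'E':  [['T', "E'"]],
    "E'": [['+', 'T', "E'"], []],        # [] is the epsilon production
    'T':  [['id'], ['(', 'E', ')']],
}
NONTERMS = set(GRAMMAR)

def first_sets(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                        # iterate until nothing new is added
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                add = set()
                for sym in prod:
                    if sym not in NONTERMS:      # terminal: it starts this production
                        add.add(sym)
                        break
                    add |= first[sym] - {''}     # nonterminal: copy its FIRST set
                    if '' not in first[sym]:     # not nullable, so stop here
                        break
                else:
                    add.add('')                  # every symbol (or none) was nullable
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first

print(first_sets(GRAMMAR))
```

For this grammar, FIRST(T) comes out as {'id', '('}: seeing neither token where a T is expected, a repairing parser could offer to insert an 'id'.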
Regardless of what you do, syntax error repair doesn't help much, because it is almost impossible to guess what the writer of the parsed document actually intended. This suggests fancy schemes won't really be any more useful. I stick to simple ones; people are happy to get an error report and some semi-graceful continuation of the parse.
The real problem with taking your own parser to a real language is that real languages are nasty, messy things; people building real implementations make mistakes and get stuck with existing code bases, or insist on bending/"improving" the language (the standard is for wimps, goodies are good for marketing) because it's cool. Expect to spend a lot of time recalibrating what you think the grammar is against the ground truth of real code. As a general rule, if you want a working parser, you're better off getting one with a track record than rolling your own.
A lesson most people who set out to build a parser don't get is that if they want to do anything useful with the parse result or tree, they will need a lot more basic machinery than just the parser. Check my bio for "Life After Parsing."