People have been trying to figure out how to report and repair syntax errors from the very beginning. There are many technical papers on how to do this. Searching for the string "syntax correction" on scholar.google.com gives 57 hits.
There are several problems:
1) How to report a meaningful error to the reader. To begin with, there is a difference between where the parser detects an error and where the user actually made the error. For example, a C program might have a “++” operator in a strange place:
void p { x = y ++ z = 0; <EOF>
Most parsers will choke when they reach the "z", and report that as the location of the error. However, if the mistake was writing “++” where “+” was intended, this report is wrong. Unfortunately, getting this right requires reading the programmer's mind.
You also have a problem with the error message itself. Do you report the error as being in an expression [at first glance it seems so]? In a statement? In a line? In a function body? In a function declaration? You probably want to report it in the narrowest syntactic category that can surround the error point. (Note that you cannot report the function body or the function declaration as the "environment" of the error point, because they are not complete either!) What if the error was really a missing semicolon after the ++? Then the error location was not really “in an expression”. What if the repair requires inserting a missing line? A macro continuation character?
So you need to decide somehow what constitutes the actual error, and that leads to error repair.
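To make the gap between the detection point and the actual mistake concrete, here is a minimal sketch of a left-to-right parser for a toy assignment grammar. The grammar, the names `tokenize`, `parse` and `ParseError`, and the decision to drop the `void p {` wrapper from the example are all assumptions made for illustration; this is not how any particular C parser works.

```python
import re

def tokenize(src):
    # Longest match first, so "++" is one token rather than two "+" tokens.
    return re.findall(r"\+\+|[A-Za-z_]\w*|\d+|[=+;]", src)

class ParseError(Exception):
    def __init__(self, pos, expected):
        super().__init__(f"token {pos}: expected {expected}")
        self.pos = pos
        self.expected = expected

def parse(tokens):
    """Toy grammar: stmt := IDENT '=' expr ';'
                    expr := operand ('+' operand)*
                    operand := (IDENT | NUM) '++'?"""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else "<EOF>"

    def expect(kind):
        nonlocal pos
        tok = peek()
        ok = tok.isidentifier() if kind == "IDENT" else tok == kind
        if not ok:
            raise ParseError(pos, kind)
        pos += 1

    def operand():
        nonlocal pos
        tok = peek()
        if not (tok.isidentifier() or tok.isdigit()):
            raise ParseError(pos, "operand")
        pos += 1
        if peek() == "++":      # optional postfix increment
            pos += 1

    while pos < len(tokens):
        expect("IDENT")
        expect("=")
        operand()
        while peek() == "+":
            pos += 1
            operand()
        expect(";")

tokens = tokenize("x = y ++ z = 0;")
try:
    parse(tokens)
except ParseError as err:
    print(f"detected at token {err.pos} ({tokens[err.pos]!r}), expected {err.expected!r}")
    # detected at token 4 ('z'), expected ';'
```

The parser happily accepts `y ++` as a postfix increment and only complains at `z`, one token after the `++` that may have been the real mistake.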
2) Error repair: for the tool to proceed in a meaningful way, it has to repair the error. Presumably this means patching the input token stream to produce a legal program (which you may not be able to do if the source has several errors). What if there are several possible patches? It should be obvious that the best error report is "yyyy is wrong, I suspect you should have used xxxx". How big a patch should be considered for a repair: only the token that triggered the error, the tokens that follow it, or also the tokens that precede it?
I note that it is difficult to do automatic, general error repair for hand-written parsers, because the grammar needed to guide such repairs is not explicitly available anywhere. So you would expect automatic repair to work best with tools for which the grammar is an explicit artifact.
Error repair may also need to take common mistakes into account. If people tend to leave off a ';' and inserting one fixes the file, that can be a good repair. If they rarely do this, and there is more than one possible repair (for example, replacing "++" with "+"), the alternative repair is probably the better suggestion.
3) The semantic effect of the repair. Even if you fix the syntax errors, the repaired program may not make sense. If the repair requires inserting an identifier, which identifier should be used?
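One way to take common mistakes into account is to rank the legal candidate repairs by how often each kind of fix has been seen before. The sketch below assumes a history of past fixes is available; the `observed_fixes` data is entirely made up, and the tuple encoding of a repair and the name `rank_repairs` are hypothetical.

```python
from collections import Counter

# Made-up history of fixes that programmers applied after a syntax error.
# Real counts would have to be measured from a corpus of erroneous programs.
observed_fixes = (
    [("insert", ";")] * 6
    + [("delete", "+")] * 2
    + [("replace", "++", "+")]
)
counts = Counter(observed_fixes)

def rank_repairs(candidates, counts):
    """Order candidate repairs from most to least frequently observed."""
    # A Counter returns 0 for repairs that were never observed.
    return sorted(candidates, key=lambda c: counts[c], reverse=True)

candidates = [("replace", "++", "+"), ("insert", ";")]
best = rank_repairs(candidates, counts)[0]
print(best)  # ('insert', ';')
```

With a different history, one in which the missing semicolon is rare, the same function would put the "++" to "+" replacement first.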
FWIW, our DMS Software Reengineering Toolkit performs automated grammar-driven repair. It works on the assumption that the token at the error point should either be deleted, or that some other single token should be inserted to its left. This handles a missing ';' and an extra plus sign, and it often succeeds in producing a legal repair. Often the repair is not the "right" one. At the very least, it lets the parser proceed to the rest of the source code.
I think the search for good automatic error repair will continue for a long time.
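The delete-or-insert-one-token idea can be sketched on a toy grammar as follows. This is not DMS's implementation; the grammar, the token vocabulary `VOCAB`, and the names `error_pos` and `single_token_repairs` are assumptions for illustration, and a candidate is accepted here only if the whole input becomes legal.

```python
import re

def tokenize(src):
    return re.findall(r"\+\+|[A-Za-z_]\w*|\d+|[=+;]", src)

def error_pos(tokens):
    """Index where a left-to-right parse fails, or None if the input is legal.

    Toy grammar: stmt := IDENT '=' expr ';'
                 expr := operand ('+' operand)*
                 operand := (IDENT | NUM) '++'?
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else "<EOF>"

    while pos < len(tokens):
        if not peek().isidentifier():
            return pos
        pos += 1
        if peek() != "=":
            return pos
        pos += 1
        while True:
            if not (peek().isidentifier() or peek().isdigit()):
                return pos
            pos += 1
            if peek() == "++":
                pos += 1
            if peek() != "+":
                break
            pos += 1
        if peek() != ";":
            return pos
        pos += 1
    return None

VOCAB = ["id", "0", "=", "+", "++", ";"]  # "id" stands for any identifier

def single_token_repairs(tokens):
    """Try deleting the token at the error point, or inserting one token to its left."""
    p = error_pos(tokens)
    if p is None:
        return []
    candidates = []
    if p < len(tokens):
        candidates.append((f"delete {tokens[p]!r}", tokens[:p] + tokens[p + 1:]))
    for tok in VOCAB:
        candidates.append((f"insert {tok!r} before token {p}",
                           tokens[:p] + [tok] + tokens[p:]))
    # Keep only the patches that yield a completely legal token stream.
    return [(desc, cand) for desc, cand in candidates if error_pos(cand) is None]

for src in ["x = y ++ z = 0;", "x = y + + z;"]:
    print(src, "->", [desc for desc, _ in single_token_repairs(tokenize(src))])
# x = y ++ z = 0; -> ["insert ';' before token 4"]
# x = y + + z; -> ["delete '+'", "insert 'id' before token 4", "insert '0' before token 4"]
```

The first input gets exactly one legal repair, which may or may not be what the programmer meant. The second gets three, which is the "several possible patches" problem in miniature, and two of them need an identifier or constant that the repair has to invent.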
FWIW, the paper "Syntax recovery error" for a Java-based parser generator reports that Burke's Ph.D. thesis:
M. G. Burke, 1983, A Practical Method for LR and LL Syntactic Error Diagnosis and Recovery, Ph.D. thesis, Department of Computer Science, New York University.
is pretty good. In particular, it repairs errors by examining and revising the left context of the error as well as the error region itself. It appears to be obtainable from the ACM.