Parsing Java code with ANTLR: need help with the concept

I am trying to build a compiler with ANTLR, using Java as the target language. The core of the problem is developing a recognizer that detects errors and improves the source code when the source does not conform to the grammar. The tutorials and books on ANTLR show how to compile simple code. Assuming the lexer and parser are already written, and given this source code:

int main(){ int a,b; c=20; } 

How can the program detect the error that the variable 'c' was never declared?

I followed the ANTLR compilation instructions, but the parser that ANTLR generates treats the code as valid because it conforms to the grammar rules, even though the variable c is in fact unknown.

Also, how can I write a grammar that captures object-oriented concepts? I tried with an ANTLR grammar, but the result still does not capture OOP concepts.

 public class Hello { } public class HelloTwo { Hello hl = new HelloWrong(); } 

If I compile this code, the result is valid because it conforms to the grammar, but notice that the HelloWrong class does not actually exist. This is the same problem as the undeclared variable in my first example.

Sorry for my English. I hope you can help me. Thank you.

1 answer

Whether "c" was declared is not part of the grammar.
The parser outputs an abstract syntax tree (AST); the compiler then takes this AST and performs semantic analysis on it. That is the stage where compiler errors such as "the variable c does not exist in this scope" are generated.

ANTLR produces the AST for you, and then its job is done. The next stages (semantic analysis, compilation, and generation of the executable) are performed by other parts of the compiler.


The method I have used to get the behavior you are looking for is to walk the AST and perform semantic analysis on each node. What the AST looks like depends entirely on the grammar used to create it, but for your first program it might look like this:

    PROGRAM
    |- FUNCTION_DEC "main"
       |- ARGS <none>
       |- SCOPE 1
          |- LOCAL_DEC "a", "b"
          |- EXPRESSION_STMT
             |- ASSIGNMENT
                |- VARIABLE "c"
                |- LITERAL 20

And semantic analysis could do something like this:
1) Add "main" to the symbol table as a globally accessible function
2) Add scope 1 to the scope of the main function in the symbol table
3) Add "a" and "b" to the symbol table as local variables inside scope 1
4) Look up the variable "c" in the symbol table inside scope 1, fail; look in the parent scope "main", fail; look in the global scope, fail; produce an error message: variable "c" was not found.
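
The four steps above can be sketched in plain Java as a chain of scopes, each holding its declared names and a link to its parent. This is only a minimal illustration and not ANTLR code: the names Scope, ScopeDemo, declare and isDeclared are hypothetical, and the steps are replayed by hand instead of being driven by a real tree walk.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical scope type: a set of declared names plus a link to the enclosing scope.
class Scope {
    private final Scope parent;
    private final Set<String> names = new HashSet<>();

    Scope(Scope parent) {
        this.parent = parent;
    }

    void declare(String name) {
        names.add(name);
    }

    // Look in this scope first, then walk outward through the parent scopes.
    boolean isDeclared(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            if (s.names.contains(name)) {
                return true;
            }
        }
        return false;
    }
}

class ScopeDemo {
    // Replays the four steps by hand for: int main(){ int a,b; c=20; }
    static List<String> check() {
        List<String> errors = new ArrayList<>();
        Scope global = new Scope(null);
        global.declare("main");                 // step 1: global function
        Scope mainScope = new Scope(global);    // scope of the main function
        Scope scope1 = new Scope(mainScope);    // step 2: scope 1 inside main
        scope1.declare("a");                    // step 3: local variables
        scope1.declare("b");
        if (!scope1.isDeclared("c")) {          // step 4: lookup fails at every level
            errors.add("variable \"c\" was not found");
        }
        return errors;
    }

    public static void main(String[] args) {
        check().forEach(System.out::println);
    }
}
```

In a real compiler the declare and isDeclared calls would be made while visiting the declaration and variable nodes of the tree, so the error could also carry the line and column of the offending token.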

This is a fairly typical process, as far as I know.
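
The HelloWrong example from the question can be handled the same way, as a rough sketch. Since Java lets a class be used before the point where it is declared, one common arrangement is two passes: first collect every declared class name, then check each referenced type name against that set. The class TypeNameCheck and its method undefinedTypes are hypothetical, and both lists are written by hand here instead of being collected from a tree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TypeNameCheck {
    // Second pass: report every referenced type name that was never declared.
    static List<String> undefinedTypes(Set<String> declared, List<String> referenced) {
        List<String> errors = new ArrayList<>();
        for (String name : referenced) {
            if (!declared.contains(name)) {
                errors.add("class \"" + name + "\" was not found");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // First pass result, hand-written: the classes declared in the source.
        Set<String> declared = new HashSet<>(Arrays.asList("Hello", "HelloTwo"));
        // Type names used in: Hello hl = new HelloWrong();
        List<String> referenced = Arrays.asList("Hello", "HelloWrong");
        undefinedTypes(declared, referenced).forEach(System.out::println);
    }
}
```

Checking that HelloWrong, if it existed, is actually assignable to Hello would be a further type-checking step that this sketch leaves out.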


Source: https://habr.com/ru/post/1396867/

