How to create a character table

We have a purpose for creating a compiler. We have already done lexical and parsing, but we are stuck in generating intermediate code. We realized that we need to implement a symbol table in order to proceed with the creation of intermediate code, and we do not know how to do it and what it contains.

Given the code below, what should the character table contain? (The code is written in the training language, which is described below)

Also, how can we implement the applications in our symbol table?

<PROGRAM> ::= PROGRAM ID <BLOCK> ENDPROGRAM <BLOCK> ::= {<DECLARATIONS> <SUBPROGRAMS> <SEQUENCE>} <DECLARATIONS> ::= ε | DECLARE <VARLIST> ENDDECLARE <VARLIST> ::= ε | ID ( , ID )* <SUBPROGRAMS> ::= ( <PROCORFUNC> ) * <PROCORFUNC> ::= PROCEDURE ID <PROCORFUNCBODY> ENDPROCEDURE | FUNCTION ID <PROCORFUNCBODY> ENDFUNCTION <PROCORFUNCBODY> ::= <FORMALPARS> <BLOCK> <FORMALPARS> ::= ε | ( <FORMALPARLIST> ) <FORMALPARLIST> ::= <FORMALPARITEM> ( , <FORMALPARITEM> )* <FORMALPARITEM> ::= IN ID | INOUT ID <SEQUENCE> ::= <STATEMENT> ( ; <STATEMENT> )* <STATEMENT> ::= ε | <ASSIGNMENT-STAT> | <IF-STAT> | <WHILE-STAT> | <FOR-STAT> | <EXIT-STAT> | <CALL-STAT> | <RETURN-STAT> <ASSIGNMENT-STAT> ::= ID := <EXPRESSION> <IF-STAT> ::= IF <CONDITION> THEN <SEQUENCE> <ELSEPART> ENDIF <ELSEPART> ::= ε | ELSE <SEQUENCE> <WHILE-STAT> ::= DO {<SEQUENCE>} WHILE (<CONDITION>) <FOR-STAT> ::= (<ASSIGNMENT-STAT>; <CONDITION>;<ASSIGNMENT-STAT>;) {<SEQUENCE>} <EXIT-STAT> ::= EXIT <CALL-STAT> ::= CALL ID <ACTUALPARS> <ACTUALPARS> ::= ( <ACTUALPARLIST> ) | ε <ACTUALPARLIST> ::= <ACTUALPARITEM> ( , <ACTUALPARITEM> )* <ACTUALPARITEM> ::= IN <EXPRESSION> | INOUT ID <RETURN-STAT> ::= RETURN <EXPRESSION> <CONDITION> ::= <BOOLTERM> (OR <BOOLTERM>)* <BOOLTERM> ::= <BOOLFACTOR> (AND <BOOLFACTOR>)* <BOOLFACTOR> ::= NOT [<CONDITION>] | [<CONDITION>] | <EXPRESSION> <RELATIONAL-OPER> <EXPRESSION> | TRUE | FALSE <EXPRESSION> ::= <OPTIONAL-SIGN> <TERM> ( <ADD-OPER> <TERM>)* <TERM> ::= <FACTOR> (<MUL-OPER> <FACTOR>)* <FACTOR> ::= CONSTANT | (<EXPRESSION>) | ID <IDTAIL> <IDTAIL> ::= ε | <ACTUALPARS> <RELATIONAL-OPER> ::= = | < ( ε | = | > ) | > ( ε | = ) <ADD-OPER> ::= + | - <MUL-OPER> ::= * | / <OPTIONAL-SIGN> ::= ε | <ADD-OPER> 
 PROGRAM MULTIPLY { DECLARE A, B, C ENDDECLARE PROCEDURE Aop(INOUT A) { A=A+1; } ENDPROCEDURE FUNCTION Bop(IN B){ IF [NOT[[TRUE AND FALSE]OR[TRUE]]] THEN B := 100 / 2; ELSE B := 100; ENDIF; RETURN B; } ENDFUNCTION CALL Aop(INOUT A); CALL Bop(IN B); A := 40; C := A * B; } ENDPROGRAM 
+6
source share
1 answer

The symbol table displays identifiers (usually with a domain name prefix) for information about this identifier, for example, its symbol type (local variable / parameter / function / class, etc.), data type, its order relative to other identifiers in the same volume , in its original code line, etc. A symbol table can be generated by traversing the abstract syntax tree, always keeping track of which area you are in, and adding information to the symbol table whenever you click on a variable declaration. In your example, a part of the symbol table may look like this (matching the symbol type, data type, position, and line of the source code):

 MULTIPLY.A -> {"LOCAL", "INT", 0, 4} MULTIPLY.B -> {"LOCAL", "INT", 1, 4} MULTIPLY.C -> {"LOCAL", "INT", 2, 4} MULTIPLY.Aop -> {"FUNCTION", "INT", 3, 4} MULTIPLY.Aop.A -> {"INOUTPARAM", "INT", 0, 6} 

Now you can resolve all references to variables. For example, in the expression A := A + 1 , if you know that your current scope is MULTIPLY.Aop , the MULTIPLY.Aop table will let you know that this A is an INT input / output parameter and that it is the first parameter (this information will allow you generate a stack address offset so you can load / save the variable).

+6
source

Source: https://habr.com/ru/post/903908/


All Articles