SIC Assembler Source Tokenization

I've pretty much finished coding a SIC assembler for my systems programming class, but I'm stuck on the tokenizing part.

For example, take this line of source code:

Format (free format): {LABEL} OPCODE {OPERAND {, X}} {COMMENT}

Curly braces indicate that the field is optional.

In addition, each field must be separated by at least one space or tab.

ENDFIL      LDA     EOF         COMMENT GOES HERE

The line above is fairly easy to tokenize, but the following one gives me trouble.

        RSUB                COMMENT GOES HERE

My code reads in the first word of the comment as if it were the OPERAND.

Here is my code:

//tokenize line
    if(currentLine[0] != ' ' && currentLine[0] != '\t')
    {
        stringstream stream(currentLine);
        stream >> LABEL;
        stream >> OPCODE;
        stream >> OPERAND;
        stream.str("");


        if(LABEL.length() > 6 || isdigit(LABEL[0]) || !alphaNum(LABEL))
        {
            errors[1] = 1;
        }
        else if(LABEL.length() == currentLine.length())
        {
            justLabel = true;
            errors[6] = 1;
            return;
        }
    }
    else
    {
        stringstream stream(currentLine);
        stream >> OPCODE;
        stream >> OPERAND;
        stream.str("");
    }

My professor requires the assembler to be tested with two versions of the source code: one with errors and one without.

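The OPCODE RSUB takes no OPERAND, so I know that everything after RSUB is a comment. But if the source code is missing the OPERAND for an OPCODE that requires one, the first word of the comment will be taken as the OPERAND. How can I detect that, given that some OPERANDs (labels) are not defined until later in the source?

Any hints: how do I tell whether the word after the OPCODE is a comment or an OPERAND?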


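Require a comment delimiter (a semicolon, for example) so the tokenizer can tell where the comment starts, like this: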

ENDFIL LDA EOF ;COMMENT GOES HERE
RSUB ;ANOTHER COMMENT GOES HERE
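With a delimiter in place, the comment can be split off before any fields are read. The following is only a minimal sketch of that idea, not the asker's code: the `SourceLine` struct and the `tokenize` function are hypothetical names, and it assumes `;` never appears inside an operand.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical container for the fields of one source line.
struct SourceLine {
    std::string label, opcode, operand, comment;
};

// Cut the line at the first ';' and tokenize only the part before it.
// Assumption: ';' never occurs inside an operand.
SourceLine tokenize(const std::string& line) {
    SourceLine out;
    std::string code = line;
    const std::string::size_type semi = line.find(';');
    if (semi != std::string::npos) {
        out.comment = line.substr(semi + 1);
        code = line.substr(0, semi);
    }

    std::istringstream stream(code);
    // A line whose first character is not a space or tab starts with a label.
    if (!code.empty() && !std::isspace(static_cast<unsigned char>(code[0]))) {
        stream >> out.label;
    }
    stream >> out.opcode;

    // Everything left before the comment is the operand; joining the pieces
    // turns "BUFFER, X" into "BUFFER,X".
    std::string piece;
    while (stream >> piece) {
        out.operand += piece;
    }
    return out;
}
```

Because the comment is removed first, an OPCODE with a missing OPERAND simply yields an empty `operand`, which the assembler can then report as an error.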


{LABEL}<whitespace>OPCODE<whitespace>{OPERAND{,X}}<whitespace>{COMMENT}

Since the OPCODE "RSUB" never takes an OPERAND, why not simply clear OPERAND after reading the fields whenever the OPCODE is RSUB:

if (OPCODE == "RSUB") OPERAND.clear();
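The same check can be made before the OPERAND is read, so the rest of the line is kept as the comment instead of being discarded. This is a rough sketch under assumptions: `takesOperand` and `readFields` are hypothetical helpers, and the set lists only RSUB, the one operand-less instruction mentioned in this thread.

```cpp
#include <set>
#include <sstream>
#include <string>

// Assumption: RSUB is the only opcode without an operand; add others here
// if your instruction set has them.
bool takesOperand(const std::string& opcode) {
    static const std::set<std::string> noOperand = {"RSUB"};
    return noOperand.count(opcode) == 0;
}

// Read OPCODE, then OPERAND only if the opcode expects one.
// Whatever remains on the line is stored as the comment.
void readFields(std::istringstream& stream, std::string& opcode,
                std::string& operand, std::string& comment) {
    operand.clear();
    comment.clear();
    stream >> opcode;
    if (takesOperand(opcode)) {
        stream >> operand;
    }
    // Skip leading whitespace, then take the rest of the line.
    std::getline(stream >> std::ws, comment);
}
```

Note that this only covers opcodes that never take an operand. If an opcode such as LDA is missing its operand, the first word of the comment is still read as the OPERAND.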

Source: https://habr.com/ru/post/1709053/

