I use three classes: Character, Scanner, and Test.
This is the Character class:
public class Character {
    private char cargo = '\u0007';
    private String sourceText = "";
    private int sourceIndex = 0;
    private int lineIndex = 0;
    private int columnIndex = 0;

    public Character(String sourceText, char cargo, int sourceIndex, int lineIndex, int columnIndex) {
        this.sourceText = sourceText;
        this.cargo = cargo;
        this.sourceIndex = sourceIndex;
        this.lineIndex = lineIndex;
        this.columnIndex = columnIndex;
    }

    @Override
    public String toString() {
        switch (cargo) {
            case ' ': return String.format("%6d %-6d " + " blank", lineIndex, columnIndex);
            case '\t': return String.format("%6d %-6d " + " tab", lineIndex, columnIndex);
            case '\n': return String.format("%6d %-6d " + " newline", lineIndex, columnIndex);
            default: return String.format("%6d %-6d " + cargo, lineIndex, columnIndex);
        }
    }
}
Here is my scanner class:
public class Scanner {
    private String sourceText = "";
    private int sourceIndex = -1;
    private int lineIndex = 0;
    private int columnIndex = -1;
    private int lastIndex = 0;

    public Scanner(String sourceText) {
        this.sourceText = sourceText;
        lastIndex = sourceText.length() - 1;
    }

    public Character getNextCharacter() {
        if (sourceIndex > 0 && sourceText.charAt(sourceIndex - 1) == '\n') {
            ++lineIndex;
            columnIndex = -1;
        }
        ++sourceIndex;
        ++columnIndex;
        char currentChar = sourceText.charAt(sourceIndex);
        Character objCharacter = new Character(sourceText, currentChar, sourceIndex, lineIndex, columnIndex);
        return objCharacter;
    }
}
And this is the main method of the Test class:
public static void main(String[] args) {
    String sourceText = "";
    String filePath = "D:\\Somepath\\SampleCode.dat";
    try {
        sourceText = readFile(filePath, StandardCharsets.UTF_8);
    } catch (IOException io) {
        System.out.println(io.toString());
    }
    LexicalAnalyzer.Scanner sca = new LexicalAnalyzer.Scanner(sourceText);
    LexicalAnalyzer.Character cha;
    int i = 0;
    while (i < sourceText.length()) {
        cha = sca.getNextCharacter();
        System.out.println(cha.toString());
        i++;
    }
}
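The readFile helper called in main is not shown in the post. A minimal sketch of what it might look like, assuming it simply decodes the whole file with the given charset (the class name ReadFileSketch and the method body are assumptions, not the original code):

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ReadFileSketch {
    // Hypothetical stand-in for the readFile helper, which the post does not show.
    static String readFile(String path, Charset encoding) throws IOException {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        // The bytes are decoded unchanged, so any "\r\n" line endings stay in the string.
        return new String(encoded, encoding);
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file with a Windows-style line ending, then read it back.
        Path tmp = Files.createTempFile("sample", ".dat");
        Files.write(tmp, "ab\r\ncd".getBytes(StandardCharsets.UTF_8));
        String text = readFile(tmp.toString(), StandardCharsets.UTF_8);
        System.out.println(text.length()); // 6: a, b, \r, \n, c, d
        Files.delete(tmp);
    }
}
```

If the helper works like this, every byte of the file reaches the Scanner, including carriage returns.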
Basically, I am trying to print every character in my source file (including spaces, tabs and newlines), along with its line number and column number. Note the switch statement in the toString() method of the Character class.
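For comparison, here is a minimal sketch of the intended behavior, independent of the Scanner class above. It assumes lines end in a single '\n' and that the line counter should advance right after the newline character itself has been reported. The names PositionSketch and describe are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PositionSketch {
    // Returns one "line column label" row per character of the text.
    static List<String> describe(String text) {
        List<String> rows = new ArrayList<>();
        int line = 0;
        int column = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String label;
            switch (c) {
                case ' ':  label = "blank"; break;
                case '\t': label = "tab"; break;
                case '\n': label = "newline"; break;
                default:   label = String.valueOf(c);
            }
            // Record the position of the current character first...
            rows.add(line + " " + column + " " + label);
            // ...then move to the next line only after a newline was recorded.
            if (c == '\n') {
                line++;
                column = 0;
            } else {
                column++;
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Prints: 0 0 a, 0 1 blank, 0 2 b, 0 3 newline, 1 0 c (one row per line)
        describe("a b\nc").forEach(System.out::println);
    }
}
```

The sketch updates the position after recording the current character, so the character following a newline is the first one reported on the new line.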
Say, for example, my file contains the text:
This is line #1.
This is line #2.
From my code, I expect to get:
0 0 T
0 1 h
0 2 i
0 3 s
0 4 blank
0 5 i
0 6 s
0 7 blank
0 8 l
0 9 i
0 10 n
0 11 e
0 12 blank
0 13 #
0 14 1
0 15 .
0 16 newline
1 0 T
1 1 h
1 2 i
1 3 s
1 4 blank
1 5 i
1 6 s
1 7 blank
1 8 l
1 9 i
1 10 n
1 11 e
1 12 blank
1 13 #
1 14 2
1 15 .
However, I get:
0 0 T
0 1 h
0 2 i
0 3 s
0 4 blank
0 5 i
0 6 s
0 7 blank
0 8 l
0 9 i
0 10 n
0 11 e
0 12 blank
0 13 #
0 14 1
0 15 .
0 16
0 17 newline
0 18 T
1 0 h
1 1 i
1 2 s
1 3 blank
1 4 i
1 5 s
1 6 blank
1 7 l
1 8 i
1 9 n
1 10 e
1 11 blank
1 12 #
1 13 2
1 14 .
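One thing that may be worth checking is the row "0 16" with no label in the output above. Assuming the file was saved with Windows line endings (the post does not say), each line would end in '\r' followed by '\n', and '\r' has no case in toString(). The sketch below, with the made-up names LineEndingSketch and name, prints the numeric code of each character in such a sequence:

```java
class LineEndingSketch {
    // Gives a readable name to the two line-ending characters.
    static String name(char c) {
        switch (c) {
            case '\r': return "carriage return";
            case '\n': return "newline";
            default:   return String.valueOf(c);
        }
    }

    public static void main(String[] args) {
        String text = ".\r\nT"; // assumed Windows-style line ending between two lines
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Index, numeric code, and readable name of each character.
            System.out.println(i + " " + (int) c + " " + name(c));
        }
    }
}
```

Running the same loop over the real sourceText would show whether a character with code 13 precedes each newline.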
The code is a Java adaptation of this tutorial: http://parsingintro.sourceforge.net/#contents_item_4.2.
I also looked at %n in String.format and System.getProperty("line.separator"). Why does the output differ from what I expect?
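On %n and System.getProperty("line.separator"): both yield the platform's line separator for text being written, and neither changes the characters already present in text read from a file. A small check, using the made-up names SeparatorSketch and escape:

```java
class SeparatorSketch {
    // Makes line-ending characters visible by replacing them with escape sequences.
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String fromFormat = String.format("%n");
        String fromProperty = System.getProperty("line.separator");
        // The visible form depends on the platform, e.g. \n or \r\n.
        System.out.println(escape(fromFormat));
        System.out.println(fromFormat.equals(fromProperty));
    }
}
```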