When you are only interested in part of the file you are parsing, you do not need a parser, and you do not need to write a grammar for the entire file format. A lexer grammar with ANTLR's options{filter=true;} is enough. This way you only get the tokens that you defined in your grammar, and the rest of the file is ignored.
Here is a quick demo:
    lexer grammar TestLexer;

    options{filter=true;}

    @lexer::members {
      public static void main(String[] args) throws Exception {
        String text =
            "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function1(selected=[\"red\",\"yellow\"]){}\n"+
            "\n"+
            "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function2(selected=[\"Rocket\"]){}\n"+
            "\n"+
            "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : uk.project.Evaluation.Input.Function3(selected=[\"blue\",\"yellow\"]){}\n"+
            "\n"+
            "7114437 2009-07-16 15:43:07,093 [LOGTHREAD] INFO StatusLog - Task 0 output : uk.org.project.Evaluation.Output.Function4(selected=[\"Speech\"]){}";
        ANTLRStringStream in = new ANTLRStringStream(text);
        TestLexer lexer = new TestLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        for(Object obj : tokens.getTokens()) {
          Token token = (Token)obj;
          System.out.println("> token.getText() = "+token.getText());
        }
      }
    }

    Input
      :  'Evaluation.Input.Function' '0'..'9'+ Params
      ;

    Output
      :  'Evaluation.Output.Function' '0'..'9'+ Params
      ;

    fragment
    Params
      :  '(selected=[' String ( ',' String )* '])'
      ;

    fragment
    String
      :  '"' ( ~'"' )* '"'
      ;
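Now save the grammar as TestLexer.g, generate the lexer, then compile and run it:

    java -cp antlr-3.2.jar org.antlr.Tool TestLexer.g
    javac -cp antlr-3.2.jar TestLexer.java
    java -cp .:antlr-3.2.jar TestLexer

    // or on Windows:
    java -cp .;antlr-3.2.jar TestLexer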
and you will see the following printed to the console:
    > token.getText() = Evaluation.Input.Function1(selected=["red","yellow"])
    > token.getText() = Evaluation.Output.Function2(selected=["Rocket"])
    > token.getText() = Evaluation.Input.Function3(selected=["blue","yellow"])
    > token.getText() = Evaluation.Output.Function4(selected=["Speech"])
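If it helps to see what filter mode amounts to, here is a rough plain-Java sketch of the same "keep what matches, skip everything else" scan using java.util.regex instead of ANTLR. This is not how ANTLR implements it, just an analogy: the class name FilterSketch, the extract method, and the regex are all made up for illustration, and unlike the grammar it does not distinguish Input tokens from Output tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the filtering lexer; not part of ANTLR.
final class FilterSketch {
    // Mirrors the String fragment: a quote, any non-quote characters, a quote.
    private static final String STRING = "\"[^\"]*\"";

    // Mirrors the Input/Output rules plus the Params fragment.
    private static final Pattern TOKEN = Pattern.compile(
        "Evaluation\\.(?:Input|Output)\\.Function[0-9]+"
        + "\\(selected=\\[" + STRING + "(?:," + STRING + ")*\\]\\)");

    // Scans left to right, collecting matches and silently skipping the rest.
    static List<String> extract(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text =
            "7114422 2009-07-16 15:43:07,078 [LOGTHREAD] INFO StatusLog - Task 0 input : "
            + "uk.project.Evaluation.Input.Function1(selected=[\"red\",\"yellow\"]){}\n";
        for (String t : extract(text)) {
            System.out.println("> token = " + t);
        }
    }
}
```

The ANTLR version is still the better fit once you need separate token types or want to grow the lexer into a full grammar later; the regex version is only meant to show the idea.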