How to match any character in the ANTLR parser (not lexer)?

How to match any character in the ANTLR parser (not lexer)? Where is the full language description for ANTLR4 parsers?

UPDATE

Is the answer "impossible"?

+6
3 answers

First you need to understand the roles of each part in parsing:

Lexer: This is the object that tokenizes your input string. Tokenization means converting a stream of input characters into abstract tokens (usually represented as just a number).

Parser: This is an object that works only with tokens to determine the structure of the language. A language (written as one or more grammar files) defines valid combinations of tokens.

As you can see, the parser does not even know what a letter is. It only knows tokens. So your question is already wrong. It is like asking how to cut individual atoms with a chainsaw.
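To make the split concrete, here is a minimal sketch of a combined ANTLR 4 grammar (the grammar name and all rule names are made up for illustration). The lexer rules, written in upper case, are the only place where characters appear; the parser rule refers to tokens by name and nothing else.

```antlr
grammar Greeting;

// Parser rule (lower case): sees only a sequence of token types.
greeting : HELLO NAME EOF ;

// Lexer rules (upper case): turn characters into tokens.
HELLO : 'hello' ;
NAME  : [A-Z] [a-z]* ;
WS    : [ \t\r\n]+ -> skip ;   // whitespace never reaches the parser
```

For the input `hello World`, the parser would receive the tokens HELLO and NAME followed by EOF; it never sees the individual letters.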

Having said that, it would probably help to know why you want to match individual letters of the input in your parser. It looks like your basic concept needs some adjustment.

+5

It depends on what you mean by "character". To match any token inside a parser rule, use the . (DOT) metacharacter. If you are trying to match any character inside a parser rule, you're out of luck; there is a strict separation between parser and lexer rules in ANTLR. You cannot match a character inside a parser rule.
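A small sketch of the difference, assuming a made-up grammar called AnyToken with hypothetical rule names: the same `.` means "any one token" in a parser rule and "any one character" in a lexer rule.

```antlr
grammar AnyToken;

// Parser rule: '.' matches exactly one token of any type.
// '.*?' is non-greedy, so it stops at the first SEMI.
statement : .*? SEMI ;

// Lexer rules: here '.' matches exactly one character.
SEMI : ';' ;
WS   : [ \t\r\n]+ -> skip ;
ANY  : . ;   // catch-all, listed last so the more specific rules win
```

Note that even here the parser wildcard consumes whole ANY tokens; it just happens that each of them was built from a single character by the lexer.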

+4

This is possible, but only if your grammar is so basic that the reason for using ANTLR is defeated anyway.

If you have a grammar:

    text     : ANY_CHAR* ;
    ANY_CHAR : . ;

it will do what you (apparently) need.

However, as many have pointed out, this would be rather strange. The purpose of the lexer is to identify the different tokens that the parser can then combine to form a grammar, so your lexer can either identify the specific string "JSTL/EL" as a token, or match patterns such as [A-Z]'/EL', [A-Z]'/'[A-Z][A-Z], etc., depending on what you need.

The parser is then used to define the grammar, so:

    phrase    : CHAR* jstl CHAR* ;
    jstl      : JSTL SLASH QUALIFIER ;

    JSTL      : 'JSTL' ;
    SLASH     : '/' ;
    QUALIFIER : [A-Z][A-Z] ;
    CHAR      : . ;

will accept the input "blah blah JSTL/EL ..." but not "blah blah ELST/JSTL ...".

I would advise taking a look at The Definitive ANTLR 4 Reference, in particular the "Islands in the Stream" section and the Grammar Reference (Ch. 15), which specifically deals with Unicode.
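As a rough illustration of the island idea (this is not the book's code; the grammar names, the token names, and the `${...}` delimiter are all assumptions chosen for this sketch), a lexer mode can treat everything outside the island as plain text and tokenize properly only inside it. Lexer modes require a separate lexer grammar, so the two grammars below would live in two files.

```antlr
// File IslandLexer.g4
lexer grammar IslandLexer;

// Default mode: everything up to the next '$' is one TEXT token.
OPEN : '${' -> pushMode(ISLAND) ;
TEXT : ~[$]+ | '$' ;           // a lone '$' is still plain text

mode ISLAND;
CLOSE : '}' -> popMode ;
ID    : [a-zA-Z_] [a-zA-Z_0-9]* ;
DOT   : '.' ;
WS    : [ \t]+ -> skip ;

// File IslandParser.g4
parser grammar IslandParser;
options { tokenVocab = IslandLexer; }

// The parser still only sees tokens: TEXT chunks and island tokens.
document : (TEXT | island)* EOF ;
island   : OPEN ID (DOT ID)* CLOSE ;
```

With this split, the "any character" part is handled where it belongs, in the lexer, and the parser only describes how the resulting tokens may be combined.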

+2

Source: https://habr.com/ru/post/945205/

