Regular expression for string literal in flex / lex

I am experimenting to learn flex and would like to match string literals. Currently my code is as follows:

"\""([^\n\"\\]*(\\[.\n])*)*"\"" {/*matches string-literal*/;} 

I have been struggling with variations for an hour or so and cannot get it to work as it should. I essentially want to match a string literal that cannot contain a newline (unless it is escaped) and that supports escaped characters.

I'm probably just writing a bad regex, or one that is incompatible with flex. Please advise!

+41
c regex flex-lexer lex string-literals
Jan 11 '10 at 3:45
5 answers
+53
Jan 11 '10 at 3:50

A string consists of an opening quote mark

 " 

followed by zero or more of either an escaped character (a backslash followed by anything)

 \\. 

or a character that is neither a quote nor a backslash

 [^"\\] 

and finally a terminating quote

 " 

Put it all together and you have

 \"(\\.|[^"\\])*\" 

The delimiting quotes are escaped because they are Flex metacharacters.
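
To make the three pieces of the pattern concrete, here is a minimal hand-written C sketch of the same matching logic. It is only an illustration, not what flex generates: the function name `match_string_literal` is hypothetical, and treating a NUL byte as end of input is an assumption made for the example. It also reflects two flex details: `.` does not match a newline, while a negated class such as `[^"\\]` does.

```c
#include <stddef.h>

/* Hypothetical helper: returns the length of the string literal that
 * starts at s[0], or 0 if none matches there.
 * Mirrors \"(\\.|[^"\\])*\" by hand. */
size_t match_string_literal(const char *s)
{
    size_t i = 0;

    if (s[i] != '"')               /* opening quote: \" */
        return 0;
    i++;

    for (;;) {
        char c = s[i];

        if (c == '\0')             /* input ended before the closing quote */
            return 0;
        if (c == '"')              /* closing quote: \" */
            return i + 1;
        if (c == '\\') {           /* escape: \\. */
            char next = s[i + 1];
            /* In flex, '.' matches any character except newline. */
            if (next == '\0' || next == '\n')
                return 0;
            i += 2;
        } else {                   /* ordinary character: [^"\\] */
            i++;
        }
    }
}
```

Because a backslash can only be consumed together with the character after it, an escaped quote never ends the literal; the first unescaped quote does.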

+89
Jan 11 '10 at 3:53

For a string literal on a single line, you can use this:

 \"([^\\\"\n]|\\.)*\" {/* matches a string literal on a single line; \n is excluded because a negated class would otherwise match it */;} 
+16
Feb 13 '12 at 12:30

How about using a start condition ...

 %{
 int enter_dblquotes = 0;   /* C declarations belong inside %{ ... %} */
 %}

 %x DBLQUOTES
 %%

 \"                  { BEGIN(DBLQUOTES); enter_dblquotes++; }

 <DBLQUOTES>[^"]*\"  {
     /* yytext holds the string body plus the closing quote */
     if (enter_dblquotes) {
         handle_this_dblquotes(yytext);
         BEGIN(INITIAL);  /* revert back to normal */
         enter_dblquotes--;
     }
 }
           ... more rules follow ...

Something to that effect (flex uses %s or %x to declare which start conditions to expect). When flex sees a quote in the input, it switches to another state and continues lexing until it reaches the next quote, at which point it reverts to the normal state.
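
The state switching described here can be sketched in plain C as a two-state loop. This is only an illustration of the idea, not flex output: the function `first_quoted` and its buffer handling are hypothetical, and like the rules above it does not handle escaped quotes.

```c
#include <stddef.h>

enum state { INITIAL, DBLQUOTES };   /* names borrowed from the flex sketch */

/* Hypothetical helper: copies the text between the first pair of double
 * quotes in `input` into `out` (at most outsz - 1 bytes, always
 * NUL-terminated). Returns 1 if a closing quote was found, 0 otherwise. */
int first_quoted(const char *input, char *out, size_t outsz)
{
    enum state st = INITIAL;
    size_t n = 0;

    if (outsz == 0)
        return 0;
    out[0] = '\0';

    for (const char *p = input; *p != '\0'; p++) {
        switch (st) {
        case INITIAL:
            if (*p == '"')
                st = DBLQUOTES;      /* like BEGIN(DBLQUOTES) */
            break;
        case DBLQUOTES:
            if (*p == '"') {
                out[n] = '\0';
                return 1;            /* like BEGIN(INITIAL) */
            }
            if (n + 1 < outsz)       /* silently truncate if out is full */
                out[n++] = *p;
            break;
        }
    }
    out[n] = '\0';
    return 0;                        /* input ended inside or before a string */
}
```

An exclusive start condition (%x) behaves like the `DBLQUOTES` case here: while it is active, only the rules tagged with that condition apply.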

+8
Jan 11 '10 at 4:04

A late answer, but one that may be useful to the next person who needs it:

 \"(([^\"]|\\\")*[^\\])?\" 
0
Jun 03 '17 at 20:31