Regex for string using GNU C regex library

I am writing a regex for use with the GNU C regex library:

The line has the following form (the text in parentheses describes the content):

(NOT a #) state (possibly whitespace) : data

I wrote the following code, but it does not match.

 regcomp(&start_state, "^[^#][ \\t]*\\(state\\)[ \\t]*[:].*$", REG_EXTENDED);

What do I need to write?

Examples that should match:

state : q0
state: q0
state:q0s

Examples that should not match:

#state: q0
state q0
# state: q0

Thanks!

3 answers

The pattern in your question consumes the first letter of state with [^#], which makes a match impossible: the rest of the pattern is then trying to match tate against \(state\).

You also passed the REG_EXTENDED flag, which means you leave capturing parentheses unescaped and escape literal parentheses instead, so \(state\) looks for the literal text (state).
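
Both points can be checked with a small helper. This is only a sketch: the name `re_match` and its return convention are made up for illustration, and the patterns are shortened versions of the one in the question.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if input matches pattern, 0 if it
   does not, and -1 if the pattern fails to compile.
   cflags is passed straight to regcomp (0 selects basic syntax). */
int re_match(const char *pattern, int cflags, const char *input)
{
    regex_t re;
    int rc;

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, input, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

With this helper, `re_match("^[^#]state:", REG_EXTENDED, "state: q0")` should return 0, because `[^#]` has already consumed the `s`. Likewise `re_match("^\\(state\\)", REG_EXTENDED, "state: q0")` should return 0 while the same pattern with `cflags` set to 0 should return 1, because the escaped parentheses are a group only in basic syntax.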

With regular expressions, say what you do want to match:

 ^[ \\t]*(state)[ \\t]*:.*$ 

as in

    #include <stdio.h>
    #include <regex.h>

    int main(int argc, char **argv)
    {
      struct {
        const char *input;
        int expect;   /* 1 if the input should match, 0 otherwise */
      } tests[] = {
        /* should match */
        { "state : q0", 1 },
        { "state: q0", 1 },
        { "state:q0s", 1 },

        /* should not match */
        { "#state :q0", 0 },
        { "state q0", 0 },
        { "# state :q0", 0 },
      };
      size_t i;
      regex_t start_state;
      const char *pattern = "^[ \\t]*(state)[ \\t]*:.*$";

      (void)argc;

      /* regcomp returns nonzero if the pattern cannot be compiled */
      if (regcomp(&start_state, pattern, REG_EXTENDED)) {
        fprintf(stderr, "%s: bad pattern: '%s'\n", argv[0], pattern);
        return 1;
      }

      for (i = 0; i < sizeof(tests)/sizeof(tests[0]); i++) {
        /* regexec returns 0 on a match */
        int status = regexec(&start_state, tests[i].input, 0, NULL, 0);

        printf("%s: %s (%s)\n", tests[i].input,
                                status == 0 ? "match" : "no match",
                                !status == !!tests[i].expect
                                  ? "PASS" : "FAIL");
      }

      regfree(&start_state);

      return 0;
    }

Output:

    state : q0: match (PASS)
    state: q0: match (PASS)
    state:q0s: match (PASS)
    #state :q0: no match (PASS)
    state q0: no match (PASS)
    # state :q0: no match (PASS)
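
One caveat about the `[ \\t]` part of the pattern. POSIX specifies that a backslash is an ordinary character inside a bracket expression, so the regex text `[ \t]` is read as "space, backslash, or the letter t" rather than "space or tab". The test inputs contain no tabs, so the difference does not show there. If tab-indented lines should be accepted, one option is the `[[:blank:]]` class. The sketch below assumes that requirement; the helper name `is_state_line` is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if line has the form
   "state : data" with optional blanks, 0 otherwise
   (also 0 if the pattern fails to compile). */
int is_state_line(const char *line)
{
    /* [[:blank:]] is the POSIX character class for space and tab. */
    static const char *pattern = "^[[:blank:]]*state[[:blank:]]*:.*$";
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;
    rc = regexec(&re, line, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

Another option is to put a real tab character into the bracket expression by writing `"[ \t]"` in the C source, so that the C compiler, not the regex library, interprets the escape.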

OK, I figured it out:

 regcomp(&start_state, "^[^#]*[ \\t]*state[ \\t]*:.*$", REG_EXTENDED);

The above solves my problem! (It turns out I forgot to put * after [^#].)

Thanks for your help, Rubens! :)


This works with your data:

 ^[^#]\s*\w+\s*:(?<data>.*?)$ 

EDIT: I'm not sure, but you may need to enable multi-line support, since the leading ^ and the trailing $ behave differently with that option.
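
Note that the named group and the lazy quantifier in the pattern above are not part of the syntax POSIX defines for regcomp. With the GNU C library, a numbered group read back through `regmatch_t` can play the role of the `data` group. The following is a sketch under that assumption; the helper name `extract_state_data` and the fixed keyword `state` are illustrative choices, not part of the question.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if line has the form "state : data", copies
   everything after the colon into out (truncated to fit) and
   returns 1. Returns 0 on no match or on a compile error. */
int extract_state_data(const char *line, char *out, size_t outsz)
{
    /* Group 1 stands in for the named group "data". */
    static const char *pattern = "^[[:blank:]]*state[[:blank:]]*:(.*)$";
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = group 1 */
    size_t len;

    if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    if (regexec(&re, line, 2, m, 0) != 0) {
        regfree(&re);
        return 0;
    }
    regfree(&re);

    /* rm_so and rm_eo are byte offsets into line. */
    len = (size_t)(m[1].rm_eo - m[1].rm_so);
    if (len >= outsz)
        len = outsz - 1;
    memcpy(out, line + m[1].rm_so, len);
    out[len] = '\0';
    return 1;
}
```

Called with `"state:q0s"` and a sufficiently large buffer, the helper should return 1 and leave `q0s` in the buffer. Any whitespace after the colon is kept as part of the data, as in the pattern above.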


Source: https://habr.com/ru/post/1300321/