As others have pointed out, using sscanf for this purpose is not recommended. One case it cannot handle directly is the optional ( that may or may not appear between the " and the X. In scanf, if an optional field has no delimiter to indicate that it is missing, the only way to find out is to try to parse it, notice that it is not there, and then parse the input again with a different format string.
    parsed = sscanf( formula, " \" %*[(] X%*[^>]>> %u %*[)] %c %n", &s, &del, &pos );
    if (parsed != 2) {
        parsed = sscanf( formula, " \" X%*[^>]>> %u %c %n", &s, &del, &pos );
    }
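The two-pass approach above can be wrapped in a small function. This is only a sketch: the name `parse_formula` and its out-parameters are hypothetical, and the format strings are copied unchanged from the snippet above. Note that `%*[^>]` must consume at least one character, so these formats expect something (for example a space) between `X` and `>>`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the original answer).
 * Returns 2 on success, mirroring sscanf's conversion count. */
int parse_formula(const char *formula, unsigned *s, char *del, int *pos)
{
    assert(formula && s && del && pos);
    *pos = 0; /* %n is not counted in the return value; start from a known state */

    /* First attempt: assume the optional '(' and ')' are present. */
    int parsed = sscanf(formula, " \" %*[(] X%*[^>]>> %u %*[)] %c %n",
                        s, del, pos);
    if (parsed != 2) {
        /* Second attempt: same input, format without the parentheses. */
        parsed = sscanf(formula, " \" X%*[^>]>> %u %c %n", s, del, pos);
    }
    return parsed;
}
```

Calling `parse_formula("\"(X >> 12) ; rest", &s, &del, &pos)` should take the first format, while `"\"X >> 7 , rest"` should fall through to the second. Every additional optional field doubles the number of formats to try, which is the main reason a regular expression scales better here.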
The rest of this solution describes how to use POSIX <regex.h> basic regular expressions to parse it.
First you need to define your regular expression and compile it.
    const char *re =
        "[ \t]*\""                  /* match up to '"' */
        "[ \t]*(\\{0,1\\}[ \t]*"    /* match '(' if present */
        "X[ \t]*>>[ \t]*"           /* match 'X >>' */
        "\\([0-9][0-9]*\\)"         /* match number as subexpression */
        "[ \t]*)\\{0,1\\}[ \t]*"    /* match ')' if present */
        "\\(.\\)"                   /* match final delimiter as subexpression */
        "[ \t\r\n]*";               /* match trailing whitespace */
    regex_t reg;
    int r = regcomp(&reg, re, 0);
    if (r != 0) {
        char buf[256];
        regerror(r, &reg, buf, sizeof(buf));
        fprintf(stderr, "regcomp: %s\n", buf);
        /*...*/
    }
Now you need to execute the expression against the string you want to match. regcomp counts the parenthesized subexpressions in your regular expression and stores that count in reg.re_nsub. However, there is one implicit subexpression that is not included in this count: the entire substring that matched the expression. It always occupies the first element of the match array, matches[0]. Keep this in mind when you size the array; it is why the matches array has one more element than reg.re_nsub.
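    /* parsed, s, del and pos are the same variables used in the sscanf
       version; they are assumed to be declared elsewhere. */
    unsigned match(const regex_t *preg, const char *formula)
    {
        int r;
        const int NSUB = preg->re_nsub + 1;   /* +1 for the whole match */
        regmatch_t matches[NSUB];

        r = regexec(preg, formula, NSUB, matches, 0);
        if (r == 0) {
            parsed = preg->re_nsub;
            s = atoi(formula + matches[1].rm_so);   /* number */
            del = formula[matches[2].rm_so];        /* delimiter */
            pos = matches[0].rm_eo;                 /* end of the whole match */
        } else {
            parsed = 0;
        }
        return parsed;
    }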
When you are done with the regex, you must release it (if it was successfully compiled).
    regfree(&reg);
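Putting the three steps (compile, execute, free) together, a self-contained version might look like the sketch below. The function name `parse_formula_regex` and its return convention (2 on a match, 0 on no match, -1 on a regex error) are assumptions made for illustration; the regular expression is the one defined above. It uses `strtoul` rather than `atoi` because the target is unsigned, and a fixed-size match array instead of a variable-length one.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

#define NMATCH 3 /* whole match + 2 subexpressions, i.e. re_nsub + 1 */

/* Hypothetical helper (not from the original answer).
 * Returns 2 on a match, 0 on no match, -1 on a regex error. */
int parse_formula_regex(const char *formula, unsigned *s, char *del, int *pos)
{
    static const char *re =
        "[ \t]*\""                  /* match up to '"' */
        "[ \t]*(\\{0,1\\}[ \t]*"    /* match '(' if present */
        "X[ \t]*>>[ \t]*"           /* match 'X >>' */
        "\\([0-9][0-9]*\\)"         /* number as subexpression 1 */
        "[ \t]*)\\{0,1\\}[ \t]*"    /* match ')' if present */
        "\\(.\\)"                   /* delimiter as subexpression 2 */
        "[ \t\r\n]*";               /* trailing whitespace */
    regex_t reg;
    regmatch_t matches[NMATCH];
    int result = 0;

    assert(formula && s && del && pos);

    int r = regcomp(&reg, re, 0);   /* flags 0: basic regular expression */
    if (r != 0) {
        char buf[256];
        regerror(r, &reg, buf, sizeof buf);
        fprintf(stderr, "regcomp: %s\n", buf);
        return -1;                  /* compile failed: nothing to free */
    }
    if (reg.re_nsub + 1 != NMATCH) {
        regfree(&reg);              /* pattern and array size disagree */
        return -1;
    }
    if (regexec(&reg, formula, NMATCH, matches, 0) == 0) {
        *s = (unsigned)strtoul(formula + matches[1].rm_so, NULL, 10);
        *del = formula[matches[2].rm_so];
        *pos = (int)matches[0].rm_eo;   /* offset just past the whole match */
        result = 2;
    }
    regfree(&reg);                  /* free only after a successful regcomp */
    return result;
}
```

Compiling the expression on every call keeps the sketch short. If many strings are parsed, it is probably better to compile once, reuse the `regex_t`, and call `regfree` at the end, as the answer describes. Unlike the sscanf formats, this pattern also accepts input with no space between `X` and `>>`.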