Scanf to read zero or more characters from a character set

I need to be very strict about the characters that may be on the read line.

The line is a series of spaces, followed by an optional character, followed by another series of spaces.
Examples: " c " , "c" , "" , " "

I need a format specifier that lets me skip a character, but only if it is this particular character and not some other one. An input such as " e " must be rejected.

I tried " %*[c] ", but my unit tests fail in some scenarios, which made me believe that " %*[c] " looks for one or more 'c' instead of zero or more 'c'.
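The suspicion can be checked directly. The sketch below uses a hypothetical helper, `consumed_by_scanset` (not part of my real code), that reports how far `sscanf` got by reading back `%n`. It relies on the standard behaviour that a `%[` conversion must match at least one character, so everything after it in the format, including `%n`, is skipped when the scanset matches nothing.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns the number of characters consumed by
   " %*[c] %n", or -1 when the scan stopped before reaching %n
   (that is, when %*[c] found no 'c' to match). */
int consumed_by_scanset(const char *input)
{
    int pos = -1;
    int rc;

    assert(input != NULL);

    /* Suppressed conversions and %n are not counted in the return value,
       so rc is 0 or EOF here; only pos carries information. */
    rc = sscanf(input, " %*[c] %n", &pos);
    (void)rc;
    return pos;
}
```

With this helper, " c " and "c" report a position, while "", " " and " e " all report -1, so the single format string cannot tell a blank line from a line holding a wrong character.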

I wrote a mini example to better illustrate my problem. Keep in mind that this is just a minimal example. The central issue is how to parse zero or one occurrence of a character.

    #include <stdio.h>
    #include <string.h>

    /* Returns 0 when formula parses as expected, 1 otherwise. */
    unsigned match(const char * formula){
        unsigned e = 0, s;
        char del;
        int parsed, pos, len = (int) strlen(formula);
        const size_t soc = sizeof( char );

        del = ' ';
        /* sscanf_s expects the buffer size for %c as an unsigned value */
        parsed = sscanf_s( formula, " \" %*[(] X%*[^>]>> %u %*[)] %c %n", &s, &del, (unsigned) soc, &pos ); // (X >> s )

        if( ( 2 == parsed ) && ( pos == len) && ( '"' == del ) ){
            printf("%6s:%s\n", "OK", formula);
        }else{
            printf("%6s:%s\n", "FAIL", formula);
            e += 1;
        }
        return e;
    }

    int main( void )
    {
        unsigned e = 0;

        printf("SHOULD BE OK\n");
        e += match(" \"X >> 3\""); // This one does not feature the optional characters
        e += match(" \"( X >> 3 ) \"");
        e += match(" \"( X >> 3 ) \"\r");

        printf("SHOULD FAIL\n");
        if ( 0 == match(" \"( Y >> 3 ) \"") ) e += 1;
        if ( 0 == match(" \"g X >> 3 ) \"") ) e += 1;
        if ( 0 == match(" \"( X >> 3.3-4.2 ) \"") ) e += 1;

        if( 0 != e ){
            printf( "ERRORS: %2u\n", e );
        }
        else{
            printf( "all pass\n" );
        }
        return (int) e;
    }
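For the reduced problem (spaces, at most one 'c', spaces), one possible approach is to split the scan into pieces and use `%n` to find out where each piece stopped. This is only a sketch: `blank_or_single_c` is a hypothetical name, and it assumes the C library stores `%n` after a whitespace directive even when the input is exhausted, which is what the C standard describes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when line is optional whitespace, at most
   one 'c', then optional whitespace; returns 0 for anything else. */
int blank_or_single_c(const char *line)
{
    int lead = -1, tail = -1;
    int rc;

    assert(line != NULL);

    rc = sscanf(line, " %n", &lead);        /* skip leading whitespace */
    (void)rc;                               /* 0 or EOF; only %n matters */
    if (lead < 0)
        return 0;

    if (line[lead] == 'c')                  /* zero or one 'c' */
        lead++;

    rc = sscanf(line + lead, " %n", &tail); /* skip trailing whitespace */
    (void)rc;

    /* Accept only if the whole line has been consumed. */
    return tail >= 0 && line[lead + tail] == '\0';
}
```

The optional character is tested with a plain comparison between the two scans, so a missing 'c' no longer aborts the rest of the parse, and any other character leaves unconsumed input that fails the final check.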
1 answer

As others have pointed out, using sscanf for this purpose is not recommended. A case it cannot handle is the optional ( that may or may not appear between " and X. With scanf, if an optional field has no delimiter to indicate that it is missing, the only way to detect its absence is to try to parse it, notice that it is not there, and then parse the input again using a different scan format string.

    parsed = sscanf( formula, " \" %*[(] X%*[^>]>> %u %*[)] %c %n", &s, &del, &pos );
    if (parsed != 2) {
        parsed = sscanf( formula, " \" X%*[^>]>> %u %c %n", &s, &del, &pos );
    }
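The two-pass idea can be wrapped into a self-contained function. The following is a minimal sketch rather than a drop-in replacement: `parse_shift` and `whole_line_ok` are hypothetical names, it uses standard `sscanf` instead of `sscanf_s`, and it resets `pos` to -1 before each attempt so that a scan that stops early cannot leave a stale offset behind.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true when both conversions succeeded, the final
   delimiter is '"', and the scan consumed the whole string. */
static int whole_line_ok(int parsed, char del, int pos, const char *formula)
{
    return parsed == 2 && del == '"' &&
           pos >= 0 && (size_t)pos == strlen(formula);
}

/* Hypothetical wrapper: returns 1 and stores n in *shift when formula is a
   quoted "X >> n" or "( X >> n )"; returns 0 otherwise. */
int parse_shift(const char *formula, unsigned *shift)
{
    unsigned s = 0;
    char del = ' ';
    int pos = -1;
    int parsed;

    assert(formula != NULL && shift != NULL);

    /* First attempt: the form with parentheses. */
    parsed = sscanf(formula, " \" %*[(] X%*[^>]>> %u %*[)] %c %n",
                    &s, &del, &pos);
    if (!whole_line_ok(parsed, del, pos, formula)) {
        /* Second attempt: the form without parentheses. */
        s = 0;
        del = ' ';
        pos = -1;
        parsed = sscanf(formula, " \" X%*[^>]>> %u %c %n", &s, &del, &pos);
        if (!whole_line_ok(parsed, del, pos, formula))
            return 0;
    }

    *shift = s;
    return 1;
}
```

Note that `%*[^>]` still requires at least one character between X and >>, and accepts any characters other than '>' there, exactly as in the original format strings.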

The rest of this solution describes how to use POSIX <regex.h> basic regular expressions to parse it.

First you need to define your regular expression and compile it.

    const char *re =
        "[ \t]*\""                  /* match up to '"' */
        "[ \t]*(\\{0,1\\}[ \t]*"    /* match '(' if present */
        "X[ \t]*>>[ \t]*"           /* match 'X >>' */
        "\\([0-9][0-9]*\\)"         /* match number as subexpression */
        "[ \t]*)\\{0,1\\}[ \t]*"    /* match ')' if present */
        "\\(.\\)"                   /* match final delimiter as subexpression */
        "[ \t\r\n]*";               /* match trailing whitespace */
    regex_t reg;
    int r = regcomp(&reg, re, 0);
    if (r != 0) {
        char buf[256];
        regerror(r, &reg, buf, sizeof(buf));
        fprintf(stderr, "regcomp: %s\n", buf);
        /*...*/
    }

Now you need to execute the expression against the string you want to match. regcomp counts the parenthesized subexpressions in your regular expression and stores that number in reg.re_nsub . However, there is an implicit subexpression that is not included in this count: the whole substring that matched the expression. It always appears in the first element of the match array (index 0). Keep this in mind when you size the array; this is why the matches array has one more element than reg.re_nsub .
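To make the counting concrete, here is a small sketch with a hypothetical helper, `whole_match_span`, that compiles a basic regular expression, reports `re_nsub`, and returns the extent of the implicit whole-match entry at index 0. The pattern and text in the usage note are made up for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compiles pattern as a POSIX basic regular expression,
   stores the number of parenthesized subexpressions in *nsub, and stores the
   offsets of the whole match (matches[0]) in *start and *end.
   Returns 0 on a successful match, -1 otherwise. */
int whole_match_span(const char *pattern, const char *text,
                     size_t *nsub, int *start, int *end)
{
    regex_t reg;
    regmatch_t m[1];   /* only the implicit whole-match entry is requested */
    int r;

    assert(pattern && text && nsub && start && end);

    if (regcomp(&reg, pattern, 0) != 0)
        return -1;

    *nsub = reg.re_nsub;   /* does not count the whole match */

    r = regexec(&reg, text, 1, m, 0);
    if (r == 0) {
        *start = (int)m[0].rm_so;
        *end = (int)m[0].rm_eo;
    }

    regfree(&reg);
    return r == 0 ? 0 : -1;
}
```

For the pattern `\([0-9][0-9]*\)-\([a-z]\)` and the text `ab 42-x!`, `re_nsub` is 2 while the whole match covers offsets 3 to 7, so a match array that should also hold both subexpressions needs three elements.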

    unsigned match(const regex_t *preg, const char * formula){
        /*...*/
        int r;
        const int NSUB = preg->re_nsub + 1;
        regmatch_t matches[NSUB];

        r = regexec(preg, formula, NSUB, matches, 0);
        if (r == 0) {
            /* success */
            parsed = preg->re_nsub;
            s = atoi(formula + matches[1].rm_so);
            del = formula[matches[2].rm_so];
            pos = matches[0].rm_eo;
        } else {
            parsed = 0;
        }
        /*...*/

When you are done with the regex, you must release it (if it was successfully compiled).

    regfree(&reg);
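Putting the three steps together (compile, execute, free), a complete validator might look like the sketch below. `match_formula` is a hypothetical name. Recompiling the pattern on every call keeps the example short but is wasteful in real code. The `rm_so == 0` check is an addition of this sketch, since the pattern is not anchored and would otherwise accept leading garbage.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: returns 1 and stores n in *shift when formula is a
   quoted "X >> n" with optional parentheses; returns 0 otherwise. */
int match_formula(const char *formula, unsigned *shift)
{
    static const char re[] =
        "[ \t]*\""                  /* match up to '"' */
        "[ \t]*(\\{0,1\\}[ \t]*"    /* match '(' if present */
        "X[ \t]*>>[ \t]*"           /* match 'X >>' */
        "\\([0-9][0-9]*\\)"         /* number, subexpression 1 */
        "[ \t]*)\\{0,1\\}[ \t]*"    /* match ')' if present */
        "\\(.\\)"                   /* final delimiter, subexpression 2 */
        "[ \t\r\n]*";               /* trailing whitespace */
    regex_t reg;
    regmatch_t m[3];                /* whole match + 2 subexpressions */
    int ok = 0;

    assert(formula != NULL && shift != NULL);

    if (regcomp(&reg, re, 0) != 0)
        return 0;

    if (regexec(&reg, formula, 3, m, 0) == 0
        && m[0].rm_so == 0                         /* starts at the beginning */
        && (size_t)m[0].rm_eo == strlen(formula)   /* consumes everything */
        && formula[m[2].rm_so] == '"') {           /* closing quote */
        *shift = (unsigned)strtoul(formula + m[1].rm_so, NULL, 10);
        ok = 1;
    }

    regfree(&reg);
    return ok;
}
```

Because the pattern makes each parenthesis optional on its own, this sketch also accepts unbalanced input such as a lone opening parenthesis; pairing them would need two alternative patterns or an extra check.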

Source: https://habr.com/ru/post/1482430/