In C, split a char * on spaces with strtok, except between quotation marks

Is there a way to do this with strtok? Or do you have any other suggestions?

Example:

 Insert "hello world" to dbms 

Result:

 Insert
 "hello world"
 to
 dbms
+5
6 answers

You cannot use strtok() for this.

It is a good opportunity to use a state machine instead.

    #include <stdio.h>

    void printstring(const char *frm, const char *to) {
        fputc('<', stdout); // <...>\n Added for output clarity
        while (frm < to) {
            fputc(*frm++, stdout);
        }
        fputc('>', stdout);
        fputc('\n', stdout);
    }

    void split_space_not_quote(const char *s) {
        const char *start;
        int state = ' ';
        while (*s) {
            switch (state) {
                case '\n': // Could add various white-space here like \f \t \r \v
                case ' ':  // Consuming spaces
                    if (*s == '\"') {
                        start = s;
                        state = '\"'; // begin quote
                    } else if (*s != ' ') {
                        start = s;
                        state = 'T';
                    }
                    break;
                case 'T': // non-quoted text
                    if (*s == ' ') {
                        printstring(start, s);
                        state = ' ';
                    } else if (*s == '\"') {
                        state = '\"'; // begin quote
                    }
                    break;
                case '\"': // Inside a quote
                    if (*s == '\"') {
                        state = 'T'; // end quote
                    }
                    break;
            }
            s++;
        } // end while
        if (state != ' ') {
            printstring(start, s);
        }
    }

    int main(void) {
        split_space_not_quote("Insert \"hello world\" to dbms");
        return 0;
    }

Output:

    <Insert>
    <"hello world">
    <to>
    <dbms>
+2

Neither strtok nor any other function in the standard C library can do this for you. You need to write the code yourself, or find existing code in an external library.
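To illustrate why plain strtok is not enough, here is a minimal sketch that splits on spaces only. The helper name `naive_split` and its parameters are made up for this example; strtok itself has no notion of quoting, so a quoted phrase is cut at its inner space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits s in place on spaces with plain strtok and
 * stores up to max_tokens token pointers in out[]. Returns the count. */
size_t naive_split(char *s, char *out[], size_t max_tokens) {
    assert(s != NULL && out != NULL);
    size_t n = 0;
    for (char *tok = strtok(s, " ");
         tok != NULL && n < max_tokens;
         tok = strtok(NULL, " ")) {
        out[n++] = tok; /* points into the caller's (modified) buffer */
    }
    return n;
}
```

Called on a writable copy of `Insert "hello world" to dbms`, this returns five tokens rather than four, because `"hello` and `world"` come back as separate pieces.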

+6

This function takes delimiter, open-block, and close-block characters. Delimiters are ignored inside a block, and the closing character must correspond to the character that opened the block. The example splits on spaces, with blocks defined by quotation marks, square brackets, curly braces, and <>. Thanks to jongware for the comments!

    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>

    // Like strtok, this modifies the input and keeps its position in a static pointer.
    char *strmbtok(char *input, char *delimit, char *openblock, char *closeblock) {
        static char *token = NULL;
        char *lead = NULL;
        char *block = NULL;
        int iBlock = 0;      // non-zero while inside a block
        int iBlockIndex = 0; // index of the opening character in openblock

        if (input != NULL) {
            token = input;
            lead = input;
        } else {
            lead = token;
            if (*token == '\0') {
                lead = NULL;
            }
        }

        while (*token != '\0') {
            if (iBlock) {
                // Inside a block: only the matching closing character ends it
                if (closeblock[iBlockIndex] == *token) {
                    iBlock = 0;
                }
                token++;
                continue;
            }
            if ((block = strchr(openblock, *token)) != NULL) {
                iBlock = 1;
                iBlockIndex = block - openblock;
                token++;
                continue;
            }
            if (strchr(delimit, *token) != NULL) {
                *token = '\0';
                token++;
                break;
            }
            token++;
        }
        return lead;
    }

    int main(int argc, char *argv[]) {
        char *tok;
        char acOpen[] = {"\"[<{"};
        char acClose[] = {"\"]>}"};
        char acStr[] = {"this contains blocks \"a [quoted block\" and a [bracketed \"block] and <other ]\" blocks>"};

        tok = strmbtok(acStr, " ", acOpen, acClose);
        printf("%s\n", tok);
        while ((tok = strmbtok(NULL, " ", acOpen, acClose)) != NULL) {
            printf("%s\n", tok);
        }
        return 0;
    }

Output:

    this
    contains
    blocks
    "a [quoted block"
    and
    a
    [bracketed "block]
    and
    <other ]" blocks>

+4

Perhaps you can use a regular expression (see "Regular expressions in C: examples").

Here is an example of a regular expression that you can use: /([\w]+)|(\"[\w\ ]+\")/gi

To practice with regular expressions, you can also use http://regex101.com/
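As a rough sketch of the regular-expression idea in C, the following uses the POSIX `<regex.h>` interface, which is not part of the standard C library and is assumed to be available. POSIX extended syntax has no `\w`, so the pattern here is an adapted one, not the expression quoted above: a token is either a double-quoted run or a run without spaces or quotes. The helper name `regex_split` and the fixed `TOKEN_MAX` size are assumptions for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_MAX 64 /* hypothetical per-token buffer size */

/* Hypothetical helper: copies up to max_tokens tokens of s into out[],
 * truncating each to TOKEN_MAX - 1 characters. Returns the token count,
 * or 0 if the pattern fails to compile. */
size_t regex_split(const char *s, char out[][TOKEN_MAX], size_t max_tokens) {
    assert(s != NULL && out != NULL);
    regex_t re;
    regmatch_t m;
    size_t n = 0;

    /* Either "..." (quotes kept), or a run of non-space, non-quote chars. */
    if (regcomp(&re, "\"[^\"]*\"|[^\" ]+", REG_EXTENDED) != 0) {
        return 0;
    }
    while (n < max_tokens && regexec(&re, s, 1, &m, 0) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);
        if (len >= TOKEN_MAX) {
            len = TOKEN_MAX - 1;
        }
        memcpy(out[n], s + m.rm_so, len);
        out[n][len] = '\0';
        n++;
        s += m.rm_eo; /* resume the search just after this match */
    }
    regfree(&re);
    return n;
}
```

Unlike the strtok-based approaches, this leaves the input string untouched, at the cost of copying each token. An unterminated quote is not treated as a block in this sketch; the quote character is simply skipped.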

0

You can make a first pass in which strtok splits the string using the quotation mark as the separator. Then make a second pass, with the space character as the separator, over the resulting pieces that were not quoted.

Edited to add working source code:

    // Needs <stdbool.h>, <stdio.h> and <string.h>; stringToSplit must be a writable buffer.
    bool quotStr = (*stringToSplit == '\"');
    char* currQuot = strtok(stringToSplit, "\"");
    char* next = NULL;

    while(currQuot) {
        if(quotStr) {
            printf("\"%s\"\n", currQuot);
            quotStr = false;
        } else {
            // remember where the outer loop strtok left off
            next = strtok(next, "\0");

            // subdivide
            char* currWord = strtok(currQuot, " ");
            while(currWord) {
                printf("%s\n", currWord);
                currWord = strtok(NULL, " ");
            }
            quotStr = true;
        }
        currQuot = strtok(next, "\"");
        next = NULL;
    }

I believe that this will still fail in the case of empty quoted strings, though ...
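The reason is that strtok treats a run of adjacent delimiters as a single separator, so `""` never produces a token. A small sketch that makes this visible (the helper name `count_quote_fields` is made up for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the pieces strtok produces when s is
 * split in place on the double-quote character. */
size_t count_quote_fields(char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (char *tok = strtok(s, "\""); tok != NULL; tok = strtok(NULL, "\"")) {
        n++;
    }
    return n;
}
```

For `a "x" b` this yields three pieces (unquoted, quoted, unquoted), but for `a "" b` it yields only two, so the quoted/unquoted alternation that the two-pass code relies on gets out of step.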

0

My solution with strtok(). It only groups words where the opening quote follows a space and the closing quote is followed by a space (or the end of the string).

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    void split(char *argstring) {
        int _argc = 0;
        char **_argv = malloc(sizeof(char*));
        char *token;
        int myFlag = 0;

        for (token = strtok(argstring, " "); token != NULL; token = strtok(NULL, " ")) {
            size_t len = strlen(token);
            if (1 == myFlag) {
                // One of the previous tokens started with double quotes
                if ('\"' == token[len - 1]) myFlag = 0; // This token ends with double quotes
                // Enlarge the previous token, then append a space and this token
                _argv[_argc-1] = realloc(_argv[_argc-1], strlen(_argv[_argc-1]) + len + 2);
                strcat(_argv[_argc-1], " ");
                strcat(_argv[_argc-1], token);
            } else {
                // This token starts with double quotes and does not also close them
                if ('\"' == token[0] && (1 == len || '\"' != token[len - 1])) myFlag = 1;
                _argv = realloc(_argv, (_argc + 1) * sizeof(char*)); // Add one element to the array of strings
                _argv[_argc] = malloc(len + 1); // Must come from malloc so the realloc above is valid
                strcpy(_argv[_argc], token);    // Copy the token into the array
                _argc++;
            }
        }

        // Print each argument, then release it; valid indices are 0 .. _argc-1
        for (int i = 0; i < _argc; i++) {
            printf("%s\n", _argv[i]);
            free(_argv[i]);
        }
        free(_argv);
    }
0

Source: https://habr.com/ru/post/1203971/

