How can I make this work with any delimiter? - C++

I just wrote a program that tokenizes a char array using pointers. The program only needed to work with a space as the delimiter character. I turned it in and got full credit, but after turning it in I realized that it only works if the delimiter character is a space.

My question is: how can I make this program work with any delimiter character?

The function below returns a pointer to the next word in a char array. I believe this is the part that needs to change for it to work with any delimiter character.

Thanks!

The code:

    char* StringTokenizer::Next(void)
    {
        pNextWord = pStart;
        if (*pStart == '\0') { return NULL; }
        while (*pStart != delim) { pStart++; }
        if (*pStart == '\0') { return NULL; }
        *pStart = '\0';
        pStart++;
        return pNextWord;
    }

The printing loop in main:

    // this loop will display the tokens
    while ( ( nextWord = tk.Next ( ) ) != NULL )
    {
        cout << nextWord << endl;
    }
5 answers

The easiest way is to change

 while (*pStart != delim) 

to something like

    while (*pStart != '\0' && *pStart != ' ' && *pStart != '\n' && *pStart != '\t')

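Or you can make delim a string and write a function that checks whether a character is in that string:

    // Returns true if c is one of the characters in delim.
    bool isDelim(char c, const char *delim)
    {
        while (*delim) {
            if (*delim == c) return true;
            delim++;
        }
        return false;
    }

    // Also stop at the terminator so the loop cannot run past the end.
    while ( *pStart != '\0' && !isDelim(*pStart, " \n\t") )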

Or perhaps the best solution is to use one of the existing library functions that already do all this, such as strtok.
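As a rough sketch of the strtok route: the helper name splitWithStrtok and the choice to collect tokens into a vector are assumptions made for illustration, not part of the original program. Keep in mind that strtok overwrites delimiters in the buffer with '\0', treats a run of delimiters as a single separator, and keeps hidden state between calls.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits a writable buffer on any character in delims.
// strtok modifies buffer in place, so a string literal must not be passed.
std::vector<std::string> splitWithStrtok(char* buffer, const char* delims)
{
    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer, delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.push_back(tok);   // copy the token out of the buffer
    }
    return tokens;
}
```

Calling it on a char array holding "one two\tthree\n four" with the delimiter set " \t\n" should yield the four words without any empty tokens.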


Just change

 while (*pStart != delim) 

to this line

 while (*pStart != '\0' && strchr(" \t\n", *pStart) == NULL) 

The standard strchr function (declared in the string.h header) searches the C-string given as its first argument for the character given as its second argument, and returns a pointer to the first occurrence of that character, or NULL if the character does not occur. So strchr(" \t\n", *pStart) == NULL means that the current character ( *pStart ) was not found in the string " \t\n" , so it is not a separator. (Change the delimiter string " \t\n" to suit your needs, of course.)

This is a short and simple way to check whether a character belongs to a (usually small) set of characters of interest, and it uses a standard function.
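Put together, a tokenizer built on this check might look like the sketch below. The constructor and the delims member are assumptions, since the question does not show the rest of the class. Unlike the original Next, this version also skips leading delimiters and returns the last word even when no delimiter follows it.

```cpp
#include <cstring>

// Hypothetical minimal version of the class from the question.
class StringTokenizer {
public:
    // text must be a writable, NUL-terminated buffer (assumption).
    StringTokenizer(char* text, const char* delimiters)
        : pStart(text), delims(delimiters) {}

    // Returns the next token, or NULL when the input is exhausted.
    char* Next()
    {
        // Skip leading delimiters so empty tokens are not returned.
        while (*pStart != '\0' && std::strchr(delims, *pStart) != NULL) {
            pStart++;
        }
        if (*pStart == '\0') {
            return NULL;
        }
        char* pNextWord = pStart;
        // Advance to the next delimiter or to the end of the string.
        while (*pStart != '\0' && std::strchr(delims, *pStart) == NULL) {
            pStart++;
        }
        if (*pStart != '\0') {
            *pStart = '\0';   // terminate the token in place
            pStart++;
        }
        return pNextWord;
    }

private:
    char* pStart;
    const char* delims;
};
```

The printing loop from the question should work unchanged with this class, for example with a tokenizer constructed over a char array and the delimiter string " ,;".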

By the way, you can do this with a std::string instead of a C-string. Declare a const std::string that holds the separators (for example " \t\n" ), and replace the strchr call with the find method of that string.
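A minimal sketch of the std::string variant follows. The names kSeparators, isSeparator and splitOnSeparators are made up for this example, and the separator set is an assumption to adjust as needed.

```cpp
#include <string>
#include <vector>

// Assumed separator set; change it to suit the input.
const std::string kSeparators = " \t\n";

// True when c is one of the separator characters.
bool isSeparator(char c)
{
    return kSeparators.find(c) != std::string::npos;
}

// Hypothetical helper that splits text into tokens using isSeparator.
std::vector<std::string> splitOnSeparators(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string current;
    for (char c : text) {
        if (isSeparator(c)) {
            if (!current.empty()) {      // close the token in progress
                tokens.push_back(current);
                current.clear();
            }
        } else {
            current += c;
        }
    }
    if (!current.empty()) {              // last token without trailing separator
        tokens.push_back(current);
    }
    return tokens;
}
```

Because this version copies characters into new strings, it leaves the input untouched, at the cost of some extra allocation compared with the pointer-based approach.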


Hmm ... that doesn't look quite right:

 if (*pStart = '\0') 

That condition can never be true, because the assignment evaluates to '\0', which is false. I assume you intended == instead of = ? You also have a small problem here:

 while (*pStart != delim) 

If the last word in the string is not followed by the delimiter, this loop will run past the end of the string, which will cause serious problems.

Edit: if you don't really need to do this yourself, consider using a string stream for the job. It already has all the necessary machinery and is well tested. It adds some overhead, but that is acceptable in many cases.
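For illustration, here is a sketch of two common string stream idioms. The helper names splitOnWhitespace and splitOnChar are assumptions. The first relies on operator>> skipping whitespace; the second uses std::getline with a single chosen delimiter character, which keeps empty fields between consecutive delimiters.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: whitespace-separated tokens via operator>>.
std::vector<std::string> splitOnWhitespace(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string word;
    while (in >> word) {
        tokens.push_back(word);
    }
    return tokens;
}

// Hypothetical helper: fields separated by one chosen character.
std::vector<std::string> splitOnChar(const std::string& text, char delim)
{
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string field;
    while (std::getline(in, field, delim)) {
        tokens.push_back(field);
    }
    return tokens;
}
```

Neither idiom handles an arbitrary set of delimiters directly, so for that case one of the character-set checks from the other answers is still needed.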


Not compiled, but I would do something like this:

    const int N = 8; // number of entries in delimList
    char delimList[N] = {' ', ',', '.', ';', '|', '!', '$', '\n'}; // all delims here

    char* StringTokenizer::Next(void)
    {
        if (*pStart == '\0') { return NULL; }
        pNextWord = pStart;
        while (1) {
            for (int x = 0; x < N; x++) {
                if (*pStart == delimList[x]) { // this is it
                    *pStart = '\0';
                    pStart++;
                    return pNextWord;
                }
            }
            if ('\0' == *pStart) { // last word.. maybe
                return pNextWord;
            }
            pStart++;
        }
    }
    // (not compiled)

I guess we want to stick with C instead of C++. The strspn and strcspn functions are good for tokenizing with a set of delimiters. You can use strcspn to find where the next separator begins (i.e. where the current token ends), and then use strspn to find where the separator ends (i.e. where the next token begins). Loop until you get to the end.
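A sketch of that loop is shown here in C++ for consistency with the question, using the C functions through the cstring header. The helper name splitWithSpans is an assumption. In this sketch strspn measures the run of separator characters to skip, and strcspn measures the run of non-separator characters that forms the token.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes without modifying the input string.
std::vector<std::string> splitWithSpans(const char* text, const char* delims)
{
    std::vector<std::string> tokens;
    const char* p = text;
    while (true) {
        p += std::strspn(p, delims);                // skip the run of separators
        if (*p == '\0') {
            break;                                  // nothing left to tokenize
        }
        std::size_t len = std::strcspn(p, delims);  // length of the current token
        tokens.push_back(std::string(p, len));
        p += len;
    }
    return tokens;
}
```

Since nothing is written into the input, this approach also works on string literals and const buffers, which the in-place version from the question cannot handle.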


Source: https://habr.com/ru/post/1300582/

