Splitting text on non-alphabetic characters with the C++ STL

I'll be training in C++ for a competition next week. In the sample problem I'm working on, paragraphs have to be split into words. Of course, this is easy. But this problem is strange in that words like isn't must also be split, into isn and t. I know this is odd, but I have to follow it.

I have a function split() that takes a const char separator as one of its parameters. This is what I use to split words on spaces. But I can't figure this one out. Even words containing digits, like phil67bs, must be split into phil and bs.

And no, I am not asking for the full code. Pseudocode will do, or anything that helps me understand what to do. Thanks!

PS: Please, no recommendations for external libraries. Just STL. :)

+3
5 answers

Treat digits, punctuation, and anything else that is not a letter as whitespace by using a custom locale. See this SO thread about treating everything except numbers as whitespace. Use a ctype mask and do something similar to what Jerry Coffin suggests there, but keep only the letters:

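#include <algorithm>
#include <locale>
#include <vector>

// A ctype facet that classifies every character except A-Z and a-z as whitespace.
struct alphabet_only: std::ctype<char> 
{
    alphabet_only(): std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table()
    {
        // Start with every character marked as a space...
        static std::vector<std::ctype_base::mask> 
            rc(std::ctype<char>::table_size,std::ctype_base::space);

        // ...then mark the letters; std::fill stops just before '[' and '{'.
        std::fill(&rc['A'], &rc['['], std::ctype_base::upper);
        std::fill(&rc['a'], &rc['{'], std::ctype_base::lower);
        return &rc[0];
    }
};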

And boom! You are golden.
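
For what it's worth, here is a minimal sketch of how a facet like this might be plugged into a stream so that operator>> does the splitting. The names letters_only and split_letters are made up for illustration, and the table assumes an ASCII-compatible character set.

```cpp
#include <algorithm>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Same idea as the facet above: everything except A-Z and a-z counts as whitespace.
struct letters_only : std::ctype<char> {
    letters_only() : std::ctype<char>(get_table()) {}

    static std::ctype_base::mask const* get_table() {
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);
        std::fill(&rc['A'], &rc['['], std::ctype_base::upper);
        std::fill(&rc['a'], &rc['{'], std::ctype_base::lower);
        return &rc[0];
    }
};

// Hypothetical helper: extracts the runs of letters from text.
std::vector<std::string> split_letters(const std::string& text) {
    std::istringstream in(text);
    // The locale takes ownership of the facet, so there is no delete here.
    in.imbue(std::locale(std::locale(), new letters_only));

    std::vector<std::string> words;
    std::string word;
    while (in >> word)  // >> skips whatever the facet classifies as space
        words.push_back(word);
    return words;
}
```

Called as split_letters("isn't phil67bs"), this should give isn, t, phil and bs. The same imbue call works on std::cin if you would rather read straight from standard input.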

Or ... you could just do the conversion:

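// Needs <algorithm>, <cctype>, <iterator> and <vector>.
// Replaces anything that is not a letter with a space.
char changeToLetters(char input)
{
    return isalpha(static_cast<unsigned char>(input)) ? input : ' ';
}

vector<char> output;
output.reserve( myVector.size() );
transform( myVector.begin(), myVector.end(), back_inserter(output), changeToLetters );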

Which, um, is much easier to understand, though not as efficient as Jerry's idea.
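
To show where the transform version leads, here is a rough sketch that blanks out the non-letters and then lets an ordinary string stream split on the resulting spaces. The names change_to_letters and split_by_transform are my own, and it works on a std::string rather than a vector<char> purely for convenience.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Keeps letters, turns every other character into a space.
char change_to_letters(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) ? c : ' ';
}

// Hypothetical helper: takes the text by value so it can be modified in place.
std::vector<std::string> split_by_transform(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(), change_to_letters);

    // After the transform only letters and spaces remain,
    // so default whitespace splitting is enough.
    std::istringstream in(text);
    std::vector<std::string> words;
    std::string word;
    while (in >> word)
        words.push_back(word);
    return words;
}
```

The cost compared with the locale approach is one extra pass over the text, which is unlikely to matter for competition-sized input.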

Edit:

Changed 'Z' to '[' so that the entry for 'Z' itself is filled (std::fill stops just before the end of the range). Likewise changed 'z' to '{'.

+4

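You can use find_first_of, which finds the next occurrence of any character from a given set of separators.

For example: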

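size_t previous = 0;
for (;;) {
    size_t next = str.find_first_of(" '1234567890", previous);
    // Do processing: the token is str.substr(previous, next - previous).
    // It is empty when two separators are adjacent.
    if (next == string::npos)
        break;
    previous = next + 1;
}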
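
Filling in the "Do processing" step, the loop might be wrapped up roughly like this. The function name split_on and its separators parameter are only illustrative, and note that it splits on exactly the characters you list, so other punctuation would have to be added to the set by hand.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits str on any character found in separators
// and drops the empty tokens produced by adjacent separators.
std::vector<std::string> split_on(const std::string& str,
                                  const std::string& separators) {
    std::vector<std::string> tokens;
    std::string::size_type previous = 0;
    for (;;) {
        std::string::size_type next = str.find_first_of(separators, previous);

        // Take everything up to the separator, or the rest of the string.
        std::string token = (next == std::string::npos)
            ? str.substr(previous)
            : str.substr(previous, next - previous);
        if (!token.empty())
            tokens.push_back(token);

        if (next == std::string::npos)
            break;
        previous = next + 1;  // continue just past the separator
    }
    return tokens;
}
```

For instance, split_on("isn't phil67bs", " '1234567890") should give isn, t, phil and bs.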
+1

You could split on digits with something like this:

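// Splits str wherever a digit occurs; add further separators
// (spaces, apostrophes, ...) to the condition as needed.
vector<string> split(const string& str)
{
    vector<string> splits;

    string cur;
    for(size_t i = 0; i < str.size(); ++i)
    {
        if(str[i] >= '0' && str[i] <= '9')
        {
            // A digit ends the current piece, if there is one.
            if(!cur.empty())
            {
                splits.push_back(cur);
            }
            cur.clear();
        }
        else
        {
            cur += str[i];
        }
    }
    // Keep whatever is left after the last digit.
    if(!cur.empty())
    {
        splits.push_back(cur);
    }

    return splits;
}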
0

Let's say the input is in a std::string (use std::getline(cin, line), for example, to read a full line from cin):

std::vector<std::string> split(std::string const& input)
{
  std::string::const_iterator it(input.begin()), end(input.end());
  std::string current;
  std::vector<std::string> words;
  for(; it != end; ++it)
  {
    if (isalpha(static_cast<unsigned char>(*it)))
    { 
      current.push_back(*it); // add this char to the current word
    }
    else if (!current.empty())
    {
      // a non-letter ends the current word: push it to the result list
      words.push_back(current);
      current.clear(); // next word
    }
  }
  if (!current.empty())
    words.push_back(current); // don't lose the last word
  return words;
}

I have not tested it, but I think it should work ...
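
To tie the getline suggestion together with the splitting, here is a small sketch that reads every line from a stream and collects the runs of letters with std::find_if. The names is_letter, is_not_letter and read_words are assumptions of mine, not part of any library.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>  // brings in std::istream, plus std::istringstream for trying it out
#include <string>
#include <vector>

// Hypothetical predicates; the cast avoids undefined behaviour for negative chars.
static bool is_letter(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}
static bool is_not_letter(char c) { return !is_letter(c); }

// Reads all lines from in and returns every maximal run of letters.
std::vector<std::string> read_words(std::istream& in) {
    std::vector<std::string> words;
    std::string line;
    while (std::getline(in, line)) {
        std::string::const_iterator it = line.begin(), end = line.end();
        for (;;) {
            it = std::find_if(it, end, is_letter);  // start of the next word
            if (it == end)
                break;
            std::string::const_iterator stop =
                std::find_if(it, end, is_not_letter);  // one past its last letter
            words.push_back(std::string(it, stop));
            it = stop;
        }
    }
    return words;
}
```

Passing std::cin reads the whole input. Wrapping a string in a std::istringstream is handy for checking it by hand: the two lines "isn't it" and "phil67bs" should come back as isn, t, it, phil and bs.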

0

Source: https://habr.com/ru/post/1786667/

