Robust parsing of integers in C++

I am trying to write a helper function that parses integers from configuration files and from a text protocol (written by a machine, not a person). I have read "How to parse a string to int in C++?", but the solutions there do not address all of my problems. I would like the following (from most to least important):

  • Reject out-of-range values. strtoul and strtoull do not quite achieve this: when a minus sign is given, the value is negated "in the return type". Thus "-5" happily parses and returns 4294967291 or 18446744073709551611 instead of signaling an error.
  • Operate in the C locale regardless of the global locale setting (or, even better, give me a choice). Unless there is a way to set the global locale per thread, this rules out strtoul, stoul and boost::lexical_cast and leaves only istringstream (where a locale can be imbued).
  • Be reasonably strict. It definitely should not accept garbage, and ideally I would also like to disallow whitespace. This immediately makes strtol and everything based on it somewhat problematic. It seems that istringstream can work here using noskipws and checking for EOF, although this might just be a GCC bug.
  • Ideally, give some control over whether the base is fixed at 10 or inferred from a 0 or 0x prefix.
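
To make the first and third points concrete, here is a small sketch that reproduces the behavior. The names NaiveResult and naive_strtoull are made up for illustration; the only library call is std::strtoull.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical holder for what strtoull reports.
struct NaiveResult {
    unsigned long long value;
    bool range_error;  // true if strtoull set errno to ERANGE
};

// Calls strtoull in base 10 with no extra validation, to show what it accepts.
NaiveResult naive_strtoull(const char* text) {
    errno = 0;
    NaiveResult result;
    result.value = std::strtoull(text, nullptr, 10);
    result.range_error = (errno == ERANGE);
    return result;
}
```

With this, naive_strtoull("-5") reports no range error and yields the value of 0ULL - 5ULL (18446744073709551611 when unsigned long long is 64 bits wide). Leading whitespace and trailing text are also accepted silently unless the end pointer is checked.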

Any ideas for a solution? Is there an easy way to wrap an existing parsing mechanism to meet these requirements, or would it ultimately be less work to write the parser myself?

2 answers

There are several quick hacks: parse as usual (which on its own is not robust) and perform a few small checks on the input (for example, when parsing a non-negative number, check that it contains no "-" character).
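
A minimal sketch of such a quick hack, assuming an unsigned decimal value. The name parse_unsigned_checked is made up for illustration. The pre-check here is slightly stricter than the one described above: the first character must be a digit, which also rules out "-", "+" and leading whitespace. The end-pointer and ERANGE checks are additions of this sketch.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: pre-check the input, then parse with strtoul.
// Returns true and stores the value in `out` only if the whole string is a
// decimal number that fits in unsigned long.
bool parse_unsigned_checked(const std::string& text, unsigned long& out) {
    // Pre-check: reject empty input and anything not starting with a digit
    // (this covers '-', '+' and leading whitespace).
    if (text.empty() || text[0] < '0' || text[0] > '9') {
        return false;
    }
    errno = 0;
    char* end = nullptr;
    const unsigned long value = std::strtoul(text.c_str(), &end, 10);
    if (errno == ERANGE) {
        return false;  // value does not fit in unsigned long
    }
    if (end != text.c_str() + text.size()) {
        return false;  // parsing stopped before the end of the string
    }
    out = value;
    return true;
}
```

Note that strtoul still depends on the global locale, so this sketch does nothing for the locale requirement.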

The ultimate robustness check is to convert the parsed integer back to text and verify that the input text and the output text match. When comparing the text versions, you can relax things, for example by accepting leading zeros or spaces.
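
A rough sketch of the round-trip idea in its strictest form (no relaxation for leading zeros or spaces). The name parse_unsigned_roundtrip is made up for illustration, and using std::strtoull and std::to_string is just one possible choice.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: parse, convert the result back to text, and accept
// the value only if the text is reproduced exactly.
bool parse_unsigned_roundtrip(const std::string& text, unsigned long long& out) {
    const unsigned long long value = std::strtoull(text.c_str(), nullptr, 10);
    // Any input strtoull tolerated but a canonical decimal rendering would not
    // produce (sign, whitespace, leading zeros, trailing text, clamped
    // overflow) fails this comparison.
    if (std::to_string(value) != text) {
        return false;
    }
    out = value;
    return true;
}
```

Here "-5" is rejected because the negated result prints as a large positive number, and an overflowing input is rejected because the clamped maximum prints differently from the input.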


Basically, you want to use the num_get<char> facet of the classic "C" locale. This is somewhat complicated, so see this example. In short, you call use_facet<num_get<char,string::iterator> >(locale::classic()).get(begin, end, ... , outputValue).
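
A sketch of how such a call might look in full. It is not the linked example. The name parse_long_classic and the infer_base parameter are made up for illustration. The sketch uses the default num_get<char>, which reads through istreambuf_iterator<char>, because that is the specialization guaranteed to be present in every locale; as far as I know, a specialization over string::iterator would first have to be installed in a locale before use_facet can find it. It parses into a signed long, since the handling of a minus sign for unsigned targets may differ between implementations.

```cpp
#include <ios>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: parse the whole string as a long using num_get and the
// classic "C" locale. With infer_base set, a 0 or 0x prefix selects the base.
bool parse_long_classic(const std::string& text, long& out, bool infer_base = false) {
    std::istringstream stream(text);
    // num_get takes its digit and grouping rules from the stream's locale.
    stream.imbue(std::locale::classic());
    if (infer_base) {
        // With no basefield flag set, the base is taken from the prefix.
        stream.unsetf(std::ios_base::basefield);
    }
    std::istreambuf_iterator<char> begin(stream), end;
    std::ios_base::iostate err = std::ios_base::goodbit;
    long value = 0;
    const std::num_get<char>& facet =
        std::use_facet<std::num_get<char> >(std::locale::classic());
    // Calling the facet directly bypasses the stream's whitespace skipping.
    begin = facet.get(begin, end, stream, err, value);
    if (err & std::ios_base::failbit) {
        return false;  // no digits found, or value out of range
    }
    if (begin != end) {
        return false;  // unparsed characters remain
    }
    out = value;
    return true;
}
```

Called as parse_long_classic("42", v), this fills v with 42. Inputs such as " 42", "42x" or an overflowing digit string make it return false.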


Source: https://habr.com/ru/post/956892/

