Sample code on Coliru:
#include <iostream>
#include <sstream>
#include <string>

int main()
{
    double d;
    std::string s;
    std::istringstream iss("234cdefipxngh");
    iss >> d;
    iss.clear();
    iss >> s;
    std::cout << d << ", '" << s << "'\n";
}
I am reading N3337 (presumably the same as C++11). In [istream.formatted.arithmetic] we have (paraphrased):
operator>>(double& val);
As in the case of the inserters, these extractors depend on the locale's num_get<> (22.4.2.1) object to perform parsing the input stream data. These extractors behave as formatted input functions (as described in 27.7.2.2.1). After a sentry object is constructed, the conversion occurs as if performed by the following code fragment:
typedef num_get< charT,istreambuf_iterator<charT,traits> > numget;
iostate err = iostate::goodbit;
use_facet< numget >(loc).get(*this, 0, *this, err, val);
setstate(err);
Looking at 22.4.2.1:
The details of this operation occur in three stages:
- Stage 1: Determine a conversion specifier
- Stage 2: Extract characters from in and determine a corresponding char value for the format expected by the conversion specification determined in stage 1.
- Stage 3: Store results
The description of Stage 2 is too long for me to paste in full here. However, it clearly says that all characters should be extracted before conversion is attempted, and further that exactly the following characters should be extracted:
- any of 0123456789abcdefxABCDEFX+-
- the locale's decimal_point()
- the locale's thousands_sep()
Finally, the rules for Stage 3 include:
- For a floating-point value, the function strtold.
The numeric value to be stored can be one of:
- zero, if the conversion function fails to convert the entire field.
This all seems to clearly specify that the output of my code should be 0, 'ipxngh'. However, it actually outputs something else.
Is this a compiler/library bug? Is there any provision I am overlooking that lets a locale change the behaviour of Stage 2? (In another question, someone posted an example of a system that does extract the characters, but also extracts ipxn, which are not in the list specified in N3337.)
Update
As perreal pointed out, this text from Stage 2 is relevant:
If discard is true, then if '.' has not yet been accumulated, then the position of the character is remembered, but the character is otherwise ignored. Otherwise, if '.' has already been accumulated, the character is discarded and Stage 2 terminates. If it is not discarded, then a check is made to determine if c is allowed as the next character of an input field of the conversion specifier returned by Stage 1. If so, it is accumulated.
If the character is either discarded or accumulated then in is advanced by ++in and processing returns to the beginning of stage 2.
So Stage 2 can terminate if the character is in the list of allowed characters but is not a valid next character for %g. It does not say so exactly, but presumably this refers to the definition of fscanf from C99, which allows:
- a nonempty sequence of decimal digits optionally containing a decimal-point character, then an optional exponent part as defined in 6.4.4.2;
- a 0x or 0X, then a nonempty sequence of hexadecimal digits optionally containing a decimal-point character, then an optional binary exponent part as defined in 6.4.4.2;
- INF or INFINITY, ignoring case
- NAN or NAN(n-char-sequence opt), ignoring case in the NAN part, where:
and also
In other than the "C" locale, additional locale-specific subject sequence forms may be accepted.
So the Coliru output is actually correct; in fact, the processing must attempt to validate the sequence of characters extracted so far as valid input to %g while extracting each character.
The next question: is it permitted, as in the thread I linked to earlier, to accept i, n, p, etc. in Stage 2?
These are valid characters for %g; however, they are not in the list of atoms that Stage 2 is allowed to read (i.e. c == 0 for my latter quote, so the character is neither discarded nor accumulated).