Iostreams - print `wchar_t` or `charXX_t` value as a character

If you feed a wchar_t, char16_t, or char32_t to a narrow stream, it prints the numeric value of the code point.

    #include <iostream>
    using std::cout;

    int main()
    {
        cout << 'x' << L'x' << u'x' << U'x' << '\n';
    }

This outputs x120120120. The reason is that there is an operator<< for the specific combination of a basic_ostream with its own charT, but no analogous operators for the other character types, so they are silently converted to int and printed as numbers. Likewise, non-narrow string literals (L"x", u"x", U"x") are silently converted to const void* and printed as pointer values, and non-narrow string objects (wstring, u16string, u32string) will not even compile.

So the question is: what is the least terrible way to print a wchar_t, char16_t, or char32_t value to a narrow ostream as a character, and not as the numeric value of its code point? It should correctly convert every code point that is representable in the ostream's encoding to that encoding, and should report an error when the code point is not representable. (For example, given u'…' and a UTF-8 ostream, the three-byte sequence 0xE2 0x80 0xA6 should be written to the stream, but given u'â' and a KOI8-R stream, an error should be reported.)

Similarly, how can one print a non-narrow C string or string object to a narrow ostream, converting it to the output encoding?

If this cannot be done in ISO C++11, I will accept platform-specific answers.

(Inspired by this question.)

1 answer

As you noted, there is no operator<<(std::ostream&, const wchar_t) for a narrow ostream. If you want to keep that syntax, you can teach ostream how to handle wchar_t values, so that your routine is picked as a better overload than the one that first requires a conversion to an integer.

If you feel adventurous:

    namespace std {
        // Caution: adding overloads to namespace std is formally undefined behavior.
        ostream& operator<<(ostream& os, wchar_t wc)
        {
            if (unsigned(wc) < 256)   // or another upper bound
                return os << (unsigned char)wc;
            else
                throw your_favourite_exception;   // or handle the error in some other way
        }
    }

Otherwise, make a simple struct that transparently wraps a wchar_t and has a custom friend operator<<, and convert your wide characters to it before outputting them.

Edit: To convert on the fly to and from the locale, you can use the <cwchar> functions, for example:

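    #include <cstdlib>   // MB_CUR_MAX
    #include <cwchar>    // std::wcrtomb, std::mbstate_t
    #include <ostream>
    #include <string>

    std::ostream& operator<<(std::ostream& os, wchar_t wc)
    {
        std::mbstate_t state{};
        std::string mb(MB_CUR_MAX, '\0');
        std::size_t ret = std::wcrtomb(&mb[0], wc, &state);
        if (ret == static_cast<std::size_t>(-1)) {
            deal_with_the_error();   // placeholder: wc is not representable in the current locale
            return os;
        }
        mb.resize(ret);              // keep only the bytes actually written
        return os << mb;
    }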

Don't forget to set your locale to the system default:

    std::locale::global(std::locale(""));
    std::cout << L'ŭ';
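The same `std::wcrtomb` technique can be applied character by character to cover the wide-string part of the question. The following is only a sketch: the helper name `to_multibyte` and the use of `std::range_error` are assumptions for illustration, and the conversion targets the current C locale's encoding, which is not necessarily the encoding the stream's consumer expects.

```cpp
#include <climits>    // MB_LEN_MAX
#include <cstddef>    // std::size_t
#include <cwchar>     // std::wcrtomb, std::mbstate_t
#include <stdexcept>  // std::range_error
#include <string>

// Hypothetical helper: converts a wide string to the multibyte encoding of
// the current C locale, throwing if a character cannot be represented.
std::string to_multibyte(const std::wstring& ws)
{
    std::mbstate_t state{};   // initial conversion state
    std::string out;
    char buf[MB_LEN_MAX];     // large enough for any single character

    for (wchar_t wc : ws) {
        std::size_t n = std::wcrtomb(buf, wc, &state);
        if (n == static_cast<std::size_t>(-1))
            throw std::range_error("character not representable in the current locale");
        out.append(buf, n);   // append only the bytes actually produced
    }
    return out;
}
```

After the locale has been set as shown above, the result can be written with `std::cout << to_multibyte(ws)`. For stateful encodings, a final `std::wcrtomb` call with `L'\0'` would also be needed to return to the initial shift state; the sketch omits that.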

Source: https://habr.com/ru/post/1013105/

