Convert UTF-8 std::string to std::wstring on iPhone

I have a UTF-8 string (a std::string created from an array of bytes). I understand that, because of the encoding, size() / length() will not give me the actual number of glyphs if the text is Chinese, for example. I also understand that, to get the Unicode code point of each glyph, I need to convert it to a wstring (or some representation wider than UTF-8), and then I can read the values I want.

I looked around and did not find an easy way to do this with standard C++. What am I missing?

I am building with gcc 4+ for the Apple iPhone, using the Cocoa Touch framework.

+3
5 answers

To get the number of UTF-8 characters (code points) in a std::string, you can walk the string byte by byte: if a byte is between 0 and 127, it is a one-byte character; between 194 and 223, it starts a 2-byte character (so skip the next byte); between 224 and 239, it starts a 3-byte character (so skip the next 2 bytes); between 240 and 244, it starts a 4-byte character (so skip the next 3 bytes).
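
The byte ranges described here can be sketched as a small counting function. This is only an illustration: the name utf8_code_point_count is made up for this example, and the code assumes the input is already valid UTF-8 (it does not check continuation bytes).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts UTF-8 code points by inspecting lead bytes only.
// Assumes valid UTF-8; any byte outside the listed ranges is counted as a
// single one-byte character, which is a simplification.
std::size_t utf8_code_point_count(const std::string& s) {
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 1;                            // 0..127: one byte
        if (lead >= 194 && lead <= 223) len = 2;        // lead of a 2-byte sequence
        else if (lead >= 224 && lead <= 239) len = 3;   // lead of a 3-byte sequence
        else if (lead >= 240 && lead <= 244) len = 4;   // lead of a 4-byte sequence
        i += len;                                       // skip the continuation bytes
        ++count;
    }
    return count;
}
```

For a string holding two Chinese characters encoded as six bytes, size() returns 6 while this function returns 2.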

Since wchar_t on the iPhone is, I think, 32 bits, if you really want a wstring you can use UTF8-CPP to convert to UTF-32. UTF8-CPP can also give you the code points of your string.

But I do not understand why you are using C++ on the iPhone. See here: Objective-C Tuesdays: wide character strings
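
If pulling in a library is not an option, the conversion can also be written by hand. The following is a rough sketch, not how UTF8-CPP implements it: utf8_to_wstring is a name invented for this example, it assumes wchar_t is 32 bits wide, and it skips some of the validation a real library performs.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decodes UTF-8 into one wchar_t per code point.
// Assumes a 32-bit wchar_t. Bad lead bytes, bad continuation bytes and
// truncated sequences become U+FFFD; overlong forms and surrogates are
// NOT rejected, so this is not a full validator.
std::wstring utf8_to_wstring(const std::string& in) {
    std::wstring out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp = 0xFFFD;  // replacement character by default
        std::size_t len = 1;
        if (lead < 0x80)                       { cp = lead;        len = 1; }
        else if (lead >= 0xC2 && lead <= 0xDF) { cp = lead & 0x1F; len = 2; }
        else if (lead >= 0xE0 && lead <= 0xEF) { cp = lead & 0x0F; len = 3; }
        else if (lead >= 0xF0 && lead <= 0xF4) { cp = lead & 0x07; len = 4; }
        if (i + len > in.size()) {             // sequence cut off at the end
            out.push_back(static_cast<wchar_t>(0xFFFD));
            break;
        }
        bool ok = true;
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }  // not 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);       // append 6 payload bits
        }
        out.push_back(static_cast<wchar_t>(ok ? cp : 0xFFFD));
        i += ok ? len : 1;                     // resync one byte after an error
    }
    return out;
}
```

The size() of the returned wstring is then the number of code points, and each element is the numeric code point value.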

+2

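Note that converting from UTF-8 to UTF-32 (which is what a wstring would hold) will not necessarily give you the number of glyphs, because each wchar_t holds one code point rather than one glyph. For details, see: http://www.unicode.org/reports/tr15/.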

To convert from UTF-8 to UTF-32, you can use the UTF-8 CPP library:

std::wstring utf32result;  // assumes wchar_t is 32 bits wide
utf8::utf8to32(utf8string.begin(), utf8string.end(), std::back_inserter(utf32result));
+2
+1

C++ itself has no notion of UTF-8 or Unicode. Use a library or platform API for the conversion.

utf-8 std::string, , , , utf-8.

0

Well, it's not easy, and I have not used it myself, but the locale classes should help in converting your string. From the description, you can use the ctype::widen method to convert between char and wchar_t.
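
For reference, here is roughly what calling ctype::widen looks like. This is a sketch using the classic "C" locale, and widen_per_char is a name invented for the example. Note that widen maps each char to a wchar_t on its own, so it does not combine the bytes of a multi-byte UTF-8 sequence into one code point.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: widens each char independently using the
// ctype<wchar_t> facet of the classic "C" locale. This is not a UTF-8
// decoder: the result always has one wchar_t per input byte.
std::wstring widen_per_char(const std::string& s) {
    const std::ctype<wchar_t>& facet =
        std::use_facet<std::ctype<wchar_t> >(std::locale::classic());
    std::wstring out(s.size(), L'\0');
    if (!s.empty()) {
        // widen(begin, end, dest) converts the whole char range in one call
        facet.widen(s.data(), s.data() + s.size(), &out[0]);
    }
    return out;
}
```

This works as expected for plain ASCII text, but a three-byte UTF-8 character still comes out as three wide characters, so it does not solve the code point problem by itself.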

0

Source: https://habr.com/ru/post/1762437/