Glib::ustring and Japanese characters

Glib::ustring is supposed to work well with UTF-8, but I have a problem when working with Japanese strings.

If you compare the two strings "わたし" and "ワタシ" using the == operator or the compare method, the result says that the two strings are equal.

I do not understand why. How does Glib::ustring work?

The only way I got the comparison to return false was to compare strings of different lengths. For example, "海外わたわ" and "海外わた".

Very strange...

2 answers

Glib::ustring::compare uses g_utf8_collate() internally, which compares strings according to the rules of the current locale. Is your locale set to anything other than Japanese?

    #include <iostream>
    #include <glibmm/ustring.h>

    int main()
    {
        Glib::ustring s1 = "わたし";
        Glib::ustring s2 = "ワタシ";
        std::cerr << (s1 == s2) << std::endl;
        return 0;
    }

Output: 0

EDIT: But I dug a little deeper:

    #include <iostream>
    #include <glibmm.h>

    int main()
    {
        Glib::ustring s1 = "わたし";
        Glib::ustring s2 = "ワタシ";

        // Before setting the locale
        std::cout << (s1 == s1) << std::endl;
        std::cout << (s1 == s2) << std::endl;

        // Switch to the locale from the environment
        std::locale::global(std::locale(""));

        std::cout << (s1 == s1) << std::endl;
        std::cout << (s1 == s2) << std::endl;
        std::cout << s1 << std::endl;
        std::cout << s2 << std::endl;
        return 0;
    }

Output:

    1
    0
    1
    1
    わたし
    ワタシ

And that looks weird.


Source: https://habr.com/ru/post/1304439/

