Can I use memcmp to compare multibyte character strings?

I am trying to write code to compare two strings. On Windows I can use strcmp, but I want to handle multibyte character strings so that the code is portable to other platforms. Can I use memcmp? If not, is there another API I can use, or do I need to write my own?

+6
3 answers

You have to be careful. I am not an expert in Unicode or multibyte encodings, but I know that with diacritics two strings can sometimes be considered equal even though their bytes are not exactly the same. I recommend using well-tested APIs, because string encodings can get pretty messy.
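To illustrate the diacritics point, here is a minimal sketch in C. It assumes both strings are UTF-8, and `same_bytes` is a hypothetical helper name, not a standard function. The letter "é" can be stored either as the single code point U+00E9 or as "e" followed by the combining accent U+0301. Both render the same, but their bytes differ, so a byte comparison reports them as different.

```c
#include <stddef.h>
#include <string.h>

/* "é" as one precomposed code point, U+00E9, encoded in UTF-8 (2 bytes). */
const char precomposed[] = "\xC3\xA9";

/* "é" as "e" plus the combining acute accent U+0301, in UTF-8 (3 bytes). */
const char decomposed[] = "e\xCC\x81";

/* Hypothetical helper: returns 1 if both NUL-terminated strings
   contain exactly the same bytes, 0 otherwise. */
int same_bytes(const char *a, const char *b)
{
    size_t len_a = strlen(a);
    size_t len_b = strlen(b);
    return len_a == len_b && memcmp(a, b, len_a) == 0;
}
```

Calling `same_bytes(precomposed, decomposed)` returns 0 even though a reader would see the same character. Treating such strings as equal requires Unicode normalization, which the C standard library does not provide.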

See The Old New Thing on case mapping. I can't find a link about diacritics right now, but if I do, I will post it.

+5

If the two strings use the same encoding, you can use memcmp to test whether they are byte-for-byte equal. If they are UTF-8, you can even use strcmp, since a zero byte never appears inside a UTF-8 encoded string (other than as the terminator). Another option is to convert your strings to wide characters using mbstowcs.
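A minimal sketch of the difference, assuming the lengths of the buffers are known. The names `buffers_equal` and `utf8_equal` are hypothetical helpers, and the UTF-16LE arrays are only illustrative data. strcmp is safe for UTF-8 because it stops at the first zero byte, but that same behavior makes it unsuitable for encodings such as UTF-16, where zero bytes occur inside the text.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte-for-byte equality of two buffers
   whose lengths (in bytes) are known. */
int buffers_equal(const void *a, size_t a_len, const void *b, size_t b_len)
{
    return a_len == b_len && memcmp(a, b, a_len) == 0;
}

/* Hypothetical helper: equality of two NUL-terminated UTF-8 strings.
   Works because UTF-8 text contains no zero byte before the terminator. */
int utf8_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* "AB" and "AC" in UTF-16LE: every second byte is zero, so strcmp
   would stop after the first byte and wrongly report them as equal. */
const unsigned char utf16_ab[] = { 0x41, 0x00, 0x42, 0x00 };
const unsigned char utf16_ac[] = { 0x41, 0x00, 0x43, 0x00 };
```

For the UTF-16 arrays, `buffers_equal(utf16_ab, sizeof utf16_ab, utf16_ac, sizeof utf16_ac)` correctly reports a difference, which is why memcmp with explicit lengths is the safer choice when the encoding may contain zero bytes.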

+2

If the strings use the same encoding, memcmp will work fine. However, keep in mind that wide characters have different sizes on different platforms: wchar_t is 2 bytes on Windows and typically 4 bytes on Linux.
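Because memcmp counts bytes rather than characters, a comparison of wide strings has to scale the length by sizeof(wchar_t) instead of hard-coding a size. The following is a rough sketch under that assumption, and `wide_bytes_equal` is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if two NUL-terminated wide strings
   contain the same wchar_t values, 0 otherwise. */
int wide_bytes_equal(const wchar_t *a, const wchar_t *b)
{
    size_t len_a = wcslen(a);   /* length in wchar_t units, not bytes */
    size_t len_b = wcslen(b);

    /* memcmp takes a byte count, so multiply by sizeof(wchar_t);
       this keeps the code correct whether wchar_t is 2 or 4 bytes. */
    return len_a == len_b && memcmp(a, b, len_a * sizeof(wchar_t)) == 0;
}
```

For plain equality checks on wide strings, the standard wcscmp function does the same job without manual size arithmetic.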

If the strings use different encodings, you will need a library such as ICU to convert them to a common encoding before comparing.

+1

Source: https://habr.com/ru/post/909428/

