Advance a UTF-8 character to the next one

I want to change a UTF-8 character (stored in a gchar array) so that it takes the value of the next character according to the standard. I use GLib and I don't see a function for this. I have a possible solution in mind, but it would probably take more effort and would certainly not be the most efficient, since I don't know much about encodings. Is there a library that can do this? Googling didn't help.

+3
2 answers

This is essentially just add-with-carry modulo 64, because each continuation byte takes one of the 64 values from 0x80 to 0xBF. Treat the character's bytes as "digits". Increment the last byte, and if it overflows, reset it to the smallest possible value and increment the second-to-last byte, and so on.

For example, the simple case:

e0 b0 be -> e0 b0 bf

With a carry:

e0 b0 bf -> e0 b1 80

With a carry into the first byte:

e0 bf bf -> e1 80 80
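The carry rule above can be sketched in plain C. This is a rough illustration, not a vetted implementation: `utf8_increment_in_place` is a hypothetical helper name, it assumes `p` points at a valid UTF-8 character whose byte length `len` is already known, and it gives up (returns 0) whenever the next character would need a different number of bytes. Jumping from U+D7FF straight to U+E000 to avoid the surrogate range is a design choice of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: increment, in place, the valid UTF-8 character of
 * `len` bytes starting at `p`, keeping its length unchanged.
 * Returns 1 on success. Returns 0 and leaves the bytes untouched if the
 * next character needs a different length or lies past U+10FFFF. */
int utf8_increment_in_place(unsigned char *p, size_t len)
{
    /* Largest valid lead byte for 2-, 3- and 4-byte sequences. */
    static const unsigned char max_lead[5] = { 0, 0, 0xDF, 0xEF, 0xF4 };
    unsigned char tmp[4];
    size_t i;

    if (len < 1 || len > 4)
        return 0;
    if (len == 1) {
        if (p[0] >= 0x7F)
            return 0;               /* U+007F + 1 needs two bytes */
        p[0]++;
        return 1;
    }

    memcpy(tmp, p, len);            /* work on a copy so failure is harmless */

    /* Continuation bytes run from 0x80 to 0xBF: reset the ones that overflow. */
    i = len - 1;
    while (i > 0 && tmp[i] == 0xBF) {
        tmp[i] = 0x80;
        i--;
    }
    if (i == 0 && tmp[0] == max_lead[len])
        return 0;                   /* lead byte exhausted: length must grow */
    tmp[i]++;

    if (len == 3 && tmp[0] == 0xED && tmp[1] >= 0xA0) {
        /* Landed on U+D800: jump over the surrogates to U+E000. */
        tmp[0] = 0xEE; tmp[1] = 0x80; tmp[2] = 0x80;
    }
    if (len == 4 && tmp[0] == 0xF4 && tmp[1] >= 0x90)
        return 0;                   /* past U+10FFFF */

    memcpy(p, tmp, len);
    return 1;
}
```

When the function returns 0 because the length would change, the caller has to re-encode the character and shift the rest of the string.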


+6

Something like this (untested):

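/* Needs <glib.h> and <string.h>. s points at the first byte of a valid
   UTF-8 character in a writable buffer. */
gunichar c;
int len, old_len;
char buf[6];                            /* g_unichar_to_utf8() may write up to 6 bytes */

c = g_utf8_get_char(s);                 /* decode the current character */
old_len = g_unichar_to_utf8(c, NULL);   /* NULL outbuf: only compute the length */
c += 1;
if (c == 0xD800)
  c = 0xE000;                           /* skip the surrogate range, invalid in UTF-8 */
len = g_unichar_to_utf8(c, buf);        /* encode the next character */
if (len == old_len) {
  memcpy(s, buf, len);                  /* same length: overwrite in place */
} else {
  /* something more complex adjusting s length */
}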

Alternatively, use g_utf8_next_char() to find old_len: the distance from s to the pointer it returns is old_len.
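The else branch, where the encoded length changes, comes down to shifting the tail of the string before copying the new bytes in. A minimal sketch in plain C, assuming the string is NUL-terminated and lives in a buffer with spare room: `splice_char` and its `cap` parameter are hypothetical names, while `buf`, `len` and `old_len` are meant to be the values computed in the snippet above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: replace the first old_len bytes of the NUL-terminated
 * string at s with the len bytes in buf, shifting the tail as needed.
 * cap is the size of the buffer available from s onwards, in bytes.
 * Returns 1 on success, 0 (leaving s untouched) if the result would not fit. */
int splice_char(char *s, size_t cap, size_t old_len,
                const char *buf, size_t len)
{
    size_t total = strlen(s);
    size_t tail;

    if (old_len > total)
        return 0;                       /* caller passed an inconsistent length */
    tail = total - old_len + 1;         /* bytes after the character, plus the NUL */
    if (len + tail > cap)
        return 0;                       /* not enough room for the longer encoding */

    memmove(s + len, s + old_len, tail); /* regions may overlap, so memmove */
    memcpy(s, buf, len);
    return 1;
}
```

With a heap-allocated string the caller would have to grow the allocation first when the call reports that the result does not fit.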

+2

Source: https://habr.com/ru/post/1790137/

