The easiest way to get a UTF-8 substring in Julia

The slice operator does not work as expected on Julia's UTF-8 strings, because it indexes the string by byte, not by character. For example:

s = "ポケットモンスター"
s[1:4]

s[1:4] will be "ポケ", not "ポケット". Each of these characters takes 3 bytes in UTF-8, so byte index 4 is where the second character starts.
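To make the byte/character mismatch visible, here is a small sketch, assuming Julia 1.x (where `ncodeunits` and `eachindex` on strings are available). The variable names are only for illustration.

```julia
s = "ポケットモンスター"

# Number of characters vs. number of bytes (UTF-8 code units).
nchars = length(s)        # 9
nbytes = ncodeunits(s)    # 27

# Byte indices at which each character starts.
starts = collect(eachindex(s))   # 1, 4, 7, 10, ...

# The range 1:4 covers the characters starting at byte indices 1 and 4.
two = s[1:4]              # "ポケ"
```

Only the indices listed in `starts` are valid character positions, which is why a character count cannot be used directly as a slice bound.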

I would like to know the simplest and most readable way to get a UTF-8 substring in Julia.

+4
2 answers

You might want to use UTF32String instead of UTF8String if you are going to do this a lot, converting to UTF8String only when necessary.
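As far as I know, UTF32String is no longer part of Base in current Julia versions. A rough sketch of the same fixed-width idea for Julia 1.x is to work on a `Vector{Char}`, where position n is simply the n-th character, and convert back to a string only at the end. This is a stand-in for the suggestion above, not the same type.

```julia
s = "ポケットモンスター"

# One element per character, so plain integer indexing is by character.
chars = collect(s)

# Take the first four characters and turn them back into a String.
sub = join(chars[1:4])    # "ポケット"
```

This costs an extra allocation for the character vector, so it mainly pays off when many character-based slices are taken from the same string.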

+3

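UTF-8 (the default string encoding in Julia) is a variable-width encoding, so a character position has to be converted to a byte index before slicing. You can define a small helper for this: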

substr(s,i,j) = s[chr2ind(s,i):chr2ind(s,j)]

substr(s,1,4)

"ポケット"
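Note that `chr2ind` was deprecated in Julia 0.7 and is not available in 1.x. A possible equivalent for Julia 1.x, offered as a sketch rather than a definitive replacement, uses `nextind(s, 0, n)` to get the byte index of the n-th character. The helper name `substr` is just carried over from the definition above.

```julia
# Character-based substring: convert character positions i and j
# to byte indices, then slice as usual.
substr(s, i, j) = s[nextind(s, 0, i):nextind(s, 0, j)]

s = "ポケットモンスター"

a = substr(s, 1, 4)    # "ポケット"

# For a prefix, first(s, n) returns the first n characters directly.
b = first(s, 4)        # "ポケット"
```

For the common case of taking a prefix or suffix, `first(s, n)` and `last(s, n)` are probably the most readable options.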

+5

Source: https://habr.com/ru/post/1626718/

