The easiest way to get the UTF-8 substring in Julia

Question

Slicing a UTF-8 string in Julia does not work as I expected, because the range indexes the string by byte, not by character. For example:

s = "ポケットモンスター"
s[1:4]

s[1:4] returns "ポケ", not "ポケット".

I would like to know the simplest and most readable way to get the UTF-8 substring in Julia.

+4

utf-8 julia-lang

Pisit makpaisit Jan 31 '16 at 4:35

2 answers

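Strings are indexed by byte (in Julia). To index by character, convert the character positions to byte indices with chr2ind: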

substr(s,i,j) = s[chr2ind(s,i):chr2ind(s,j)]

substr(s,1,4)

"ポケット"
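On Julia 1.0 and later, chr2ind is no longer available, so the one-liner above needs a small change. The following is a minimal sketch of the same idea, assuming a 1.x release, where nextind(s, 0, n) gives the byte index of the n-th character. The name charsub is a hypothetical helper chosen for illustration, not something from Base.

```julia
# Sketch for Julia 1.0+ (assumption: chr2ind is unavailable there).
# `charsub` is a hypothetical helper name, not part of Base.
# nextind(s, 0, n) returns the byte index at which the n-th character starts.
charsub(s::AbstractString, i::Integer, j::Integer) =
    s[nextind(s, 0, i):nextind(s, 0, j)]

s = "ポケットモンスター"

charsub(s, 1, 4)   # "ポケット"
charsub(s, 5, 9)   # "モンスター"

# When only a prefix is needed, Base already provides first(s, n).
first(s, 4)        # "ポケット"
```

Both forms count characters rather than bytes, so they behave the same whether the characters are one byte wide or three.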

+5

Dan Getz Jan 31 '16 at 11:44

Scott Jones · Accepted Answer · 2016-01-31T15:33:13+0000

You might want to use UTF32String instead of UTF8String if you are going to do this a lot, converting to UTF8String only when necessary.
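In Julia 1.0 and later, UTF32String is not part of Base. A rough sketch of the same fixed-width idea, assuming a 1.x release, is to collect the string into a Vector{Char} once and slice that by character position. The variable name chars is only illustrative.

```julia
# Sketch for Julia 1.0+ (assumption: UTF32String is not in Base there).
# collect(s) produces a Vector{Char}; each element is one character,
# so ordinary array indexing works by character position.
s = "ポケットモンスター"
chars = collect(s)

length(chars)        # 9
join(chars[1:4])     # "ポケット"
join(chars[5:end])   # "モンスター"
```

The conversion costs one pass over the string and extra memory, so it pays off mainly when many character-indexed slices are taken from the same string.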
