Is there any advantage to using graphemes over split to create an array from a UTF-8 string?
For example, consider the following:
# Define a UTF-8 string with a bunch of multibyte characters
s = "{(-nββ΅Γ·βββ΅),β¨β1βββ.=β¨β³nβ1-β¨β’β΅}"
# Create an array using split
split(s, "")
# Create an array using graphemes (v0.4+; on Julia 0.7 and later this requires `using Unicode`)
collect(graphemes(s))
Both approaches give the expected result, and indeed split(s, "") == collect(graphemes(s)) returns true.
Both approaches seem to consistently produce equivalent results. Is one approach usually preferable to the other, whether for performance, style, or otherwise?
(Note that graphemes returns an iterator, not an array, hence the call to collect.)