Ruby is not playing nicely with UTF-8 strings. I am passing data in an XML file, and although the XML document is declared as UTF-8, Ruby treats each byte of the encoded text (two bytes per character) as a separate character.
I have started encoding the input strings in the '\uXXXX' format, but I can't figure out how to convert this to the actual UTF-8 character. I have searched all over this site and Google to no avail, and my frustration is pretty high right now. I am using Ruby 1.8.6.
Basically, I want to convert the string '\u03a3' to "Σ".
What I have so far is:
data.gsub /\\u([a-zA-Z0-9]{4})/, $1.hex.to_i.chr
Which of course gives a "931 out of char range" error.
Thanks. Tim
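
For reference, here is a minimal sketch of one way this conversion could work, assuming the escapes always have exactly four hex digits. The attempt above fails for two reasons: `$1` is evaluated once as an argument, before `gsub` has matched anything, and `Integer#chr` in Ruby 1.8 only accepts values from 0 to 255. The block form of `gsub` runs once per match, and `Array#pack` with the `"U"` directive encodes a code point as UTF-8 bytes. The helper name `unescape_unicode` and the sample string are made up for illustration.

```ruby
# Hypothetical helper (name not from the original post): replaces
# every \uXXXX escape in str with the UTF-8 encoding of that code point.
def unescape_unicode(str)
  # Block form of gsub: the block runs after each match, so $1
  # holds the four hex digits of the escape currently being replaced.
  str.gsub(/\\u([0-9a-fA-F]{4})/) do
    # String#hex already returns an Integer; pack("U") turns the
    # code point into its UTF-8 byte sequence.
    [$1.hex].pack("U")
  end
end

# Single quotes keep the backslash literal, as in the XML input.
data = 'Sum: \u03a3'
puts unescape_unicode(data)
```

Note that the character class is narrowed to hex digits here; the original `[a-zA-Z0-9]` would also match non-hex letters such as `z`.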