Ruby 1.9.2 character encoding: invalid multibyte character: /?/

I am trying to understand why this piece of code does not work in Ruby 1.9.2, and how it should be changed so that it does. Here is a snippet:

    ruby-1.9.2-p290 :009 > str = "hello world!"
     => "hello world!"
    ruby-1.9.2-p290 :010 > str.gsub("\223","")
    RegexpError: invalid multibyte character: /?/
            from (irb):10:in `gsub'
1 answer

Your Ruby is in UTF-8 mode, but "\223" is not a valid UTF-8 string. In UTF-8, any byte with the eighth (high) bit set belongs to a multibyte character, and more bytes have to be read to get the full character. That means "\223" is only a fragment of a UTF-8 encoded character, hence your error.
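To see this at the byte level, here is a minimal sketch, assuming a Ruby recent enough to have String#valid_encoding? and String#bytes returning an array. The variable names are only illustrative. It checks that the byte from the question has the continuation-byte bit pattern (10xxxxxx) and that, on its own, it is not valid UTF-8:

```ruby
# The single byte written as "\223" (octal) in the question.
# .dup guards against frozen string literals; force_encoding only relabels the bytes.
raw = "\223".dup.force_encoding("UTF-8")

byte = raw.bytes.first
puts byte                      # 147 (0x93)

# Bytes of the form 10xxxxxx are UTF-8 continuation bytes: they can only
# appear after a lead byte, never at the start of a character.
continuation = (byte & 0b1100_0000) == 0b1000_0000
puts continuation              # true

valid = raw.valid_encoding?
puts valid                     # false

# For comparison, the real UTF-8 encoding of the left smart quote is three bytes.
left_quote_hex = "\u201c".bytes.map { |b| format("%02X", b) }.join(" ")
puts left_quote_hex            # E2 80 9C
```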

Octal 0223 and 0224 (147 and 148 decimal) are the left and right smart quotes in the Windows-1252 character set, but Windows-1252 isn't UTF-8. In UTF-8, you want "\u201c" and "\u201d" for the quotes:
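If the data really does arrive as Windows-1252 bytes, one possible approach is to label it as such and transcode it to UTF-8 before doing anything else. This is a sketch under that assumption (the question does not say where the bytes come from), and the sample string is made up:

```ruby
# Hypothetical input: "hello world!" wrapped in Windows-1252 smart quotes.
# force_encoding relabels the bytes without changing them.
legacy = "\223hello world!\224".dup.force_encoding("Windows-1252")

# encode actually converts the bytes to their UTF-8 equivalents.
utf8 = legacy.encode("UTF-8")

puts utf8.encoding             # UTF-8
puts utf8.valid_encoding?      # true
puts utf8 == "\u201chello world!\u201d"   # true
```

After this step the quotes are ordinary UTF-8 characters, so the "\u201c" and "\u201d" escapes match them.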

    >> puts "\u201c"
    “
    >> puts "\u201d"
    ”

So, if you are trying to remove the quotes, you probably want one of these:

    str.gsub("\u201c", "").gsub("\u201d", "")
    str.gsub(/[\u201c\u201d]/, '')
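As a quick check that the two forms behave the same, here is a small sketch on a made-up UTF-8 sample string (the names sample, chained and with_class are illustrative only):

```ruby
# Hypothetical UTF-8 input containing both smart quotes.
sample = "\u201chello world!\u201d"

# Form 1: two chained string substitutions.
chained = sample.gsub("\u201c", "").gsub("\u201d", "")

# Form 2: one pass with a regexp character class.
with_class = sample.gsub(/[\u201c\u201d]/, '')

puts chained                   # hello world!
puts chained == with_class     # true
```

The character-class form scans the string once, which may be preferable when more characters need to be stripped.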

Source: https://habr.com/ru/post/1382406/

