I am trying to write a line of code that takes a line of Japanese text and deletes every span matching a specific pattern of characters. However, I am having problems using Unicode characters inside a regex.
I am currently using text.gsub(/《.*?》/u, ''), but I get this error:
'gsub': invalid byte sequence in Windows-31J (ArgumentError)
Can someone tell me what I am doing wrong?
Sample text: ใ ใฎ ไป ่ "ใ ใ ใ" ใ ใ ใพ ใ ใซ ็ก ้ ไฝ ใ ใ ใ ใ ใ ใ ใ ใฃ ใ ใฎ ใง
Expected Result: ใ ใฎ ไป ่ ใ ใ ใพ ใ ใซ ็ก ้ ไฝ ใ ใฃ ใ ใฎ ใฎ
thanks
Edit: the magic comment # encoding: utf-8 is present at the top of the script.
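For reference, here is a minimal sketch of how this kind of error can arise and one possible way around it. It rests on assumptions not confirmed in the post: that the bytes in text are really UTF-8 but the string is tagged as Windows-31J (the default external encoding on Japanese Windows), and that the pattern delimiters are 《 and 》. The helper name strip_annotations and the sample string are made up for illustration. Note that the # encoding: utf-8 magic comment only affects literals in the source file, not strings read from a file or the console.

```ruby
# encoding: utf-8

# Hypothetical helper (not from the original post): removes every 《...》 span.
def strip_annotations(text)
  text.gsub(/《.*?》/u, '')
end

# Made-up sample text, standing in for the real input.
sample = "漢字《かんじ》を読む"

# Simulate the assumed situation: UTF-8 bytes carrying a Windows-31J label.
mislabelled = sample.dup.force_encoding('Windows-31J')

# Matching a UTF-8 regex against the mislabelled string is expected to fail.
# Depending on the bytes, Ruby may raise ArgumentError (invalid byte sequence)
# or Encoding::CompatibilityError (incompatible encoding regexp match).
error_class = nil
begin
  strip_annotations(mislabelled)
rescue ArgumentError, Encoding::CompatibilityError => e
  error_class = e.class
end

# Relabel the bytes as UTF-8 (no transcoding happens) and match again.
relabelled = mislabelled.dup.force_encoding('UTF-8')
result = strip_annotations(relabelled)

puts "first attempt raised: #{error_class.inspect}"
puts result
```

If the text comes from a file, passing the encoding when reading, for example File.read(path, encoding: 'UTF-8'), may avoid the relabelling step. If the input really is Windows-31J rather than mislabelled UTF-8, text.encode('UTF-8') would be the conversion to try instead of force_encoding.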