Decode S-JIS string to UTF-8

I am working with a Japanese file, and I do not know the language. The file is encoded in S-JIS. I need to convert its content to UTF-8 so that it displays as Japanese text, and I have no idea where to start. I tried the following code, which I found somewhere on the Internet, but had no luck:

byte[] arrByte = Encoding.UTF8.GetBytes(arrActualData[x]);
string str = ASCIIEncoding.ASCII.GetString(arrByte);

Can anyone help me with this?

Thanks in advance, Kunal

1 answer

In C#, the following code works for me. I tried it out to confirm it, and the results are shown below:

public void Convert()
{
   // Read the source file, decoding its bytes as Shift_JIS.
   using (TextReader input = new StreamReader(
     new FileStream("shift-jis.txt", FileMode.Open),
       Encoding.GetEncoding("shift-jis")))
   {
      // Write the decoded characters back out, encoded as UTF-8.
      using (TextWriter output = new StreamWriter(
        new FileStream("utf8.txt", FileMode.Create), Encoding.UTF8))
      {
        var buffer = new char[512];
        int len;

        // Copy in chunks until Read returns 0 at the end of the file.
        while ((len = input.Read(buffer, 0, buffer.Length)) > 0)
        {
          output.Write(buffer, 0, len);
        }
      }
   }
}

Here is a file encoded in shift-jis (also written SJIS or Shift_JIS; they are the same encoding), with JEdit used to check the encoding. The file contains the Japanese word テスト ("test"):
[image: shift-jis.txt opened in JEdit]

After running the code, the converted file (utf8.txt):
[image: contents of utf8.txt]
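
The snippet in the question goes wrong because it encodes a string to UTF-8 bytes and then decodes those bytes as ASCII, which turns every non-ASCII character into "?". If the Shift_JIS data is already in memory as raw bytes rather than in a file, the same idea as above (decode with the source encoding, then encode as UTF-8) might be sketched as follows. The class and method names are hypothetical, and the RegisterProvider call assumes .NET Core or .NET 5+, where code-page encodings are not registered by default.

```csharp
using System;
using System.Text;

public static class SjisExample
{
    // Hypothetical helper: re-encodes Shift_JIS bytes as UTF-8 bytes.
    public static byte[] SjisToUtf8(byte[] sjisBytes)
    {
        // Assumption: running on .NET Core / .NET 5+, where Shift_JIS must be
        // made available through the code-pages provider. On .NET Framework
        // the encoding is built in and this line can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Decode using the encoding the bytes were actually written in.
        string text = Encoding.GetEncoding("shift_jis").GetString(sjisBytes);

        // Encode the resulting string as UTF-8.
        return Encoding.UTF8.GetBytes(text);
    }
}
```

Calling SjisToUtf8 with the six bytes 83 65 83 58 83 67 (テスト in Shift_JIS) should give the nine UTF-8 bytes E3 83 86 E3 82 B9 E3 83 88.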



Source: https://habr.com/ru/post/1784038/

