Downloading text in EUC-JP and other Japanese encodings in Node.js

I am trying to scrape some Japanese sites for a personal project. Sites with text in UTF-8 work fine, as expected, but I cannot get any text from sites that declare other international encodings, in particular EUC-JP. Node also seems to interpret the text and modify it, rather than passing it through raw. I tried setting the response to be interpreted as ascii and as binary, and then switching my terminal application to EUC-JP, but console.log() still does not print the actual text.

I checked the Node documentation, and it looks like it only supports two main text encodings (apart from binary and base64).

I use the built-in http client and set the encoding with the response.setEncoding method, for example response.setEncoding('utf8');

How do other people work with international text in Node, especially when the source data is not in UTF-8? Are binary buffers the only way?

Although I have done a little research, I am not very familiar with character encodings, so simple answers would be appreciated. Thanks!

1 answer

There is a module that adds iconv bindings to Node.js. If you take the response as a binary Buffer, you can use Iconv.convert to convert it from EUC-JP to UTF-8 (see the README for an example).
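The approach above can be sketched roughly as follows. This is only a minimal illustration, not the module's own example: to stay dependency-free it uses Node's built-in TextDecoder as a stand-in for the iconv module, which assumes a Node build with full ICU data (otherwise the 'euc-jp' label is rejected). The helper names decodeBody and fetchText are hypothetical, and the iconv equivalent is shown in a comment.

```javascript
const http = require('http');

// Join the raw chunks first and decode once, so that a multi-byte
// character split across two chunks is not corrupted.
function decodeBody(chunks, encoding) {
  const raw = Buffer.concat(chunks);
  // Stand-in for the iconv module (assumes full ICU for 'euc-jp').
  // With iconv, the same step would look like:
  //   const { Iconv } = require('iconv');
  //   new Iconv('EUC-JP', 'UTF-8').convert(raw).toString('utf8');
  return new TextDecoder(encoding).decode(raw);
}

// Hypothetical helper: fetch a page over http and resolve with its text.
function fetchText(url, encoding) {
  return new Promise((resolve, reject) => {
    http.get(url, (response) => {
      const chunks = [];
      // response.setEncoding() is deliberately NOT called here,
      // so each 'data' event delivers an unmodified Buffer.
      response.on('data', (chunk) => chunks.push(chunk));
      response.on('end', () => {
        try {
          resolve(decodeBody(chunks, encoding));
        } catch (err) {
          reject(err);
        }
      });
      response.on('error', reject);
    }).on('error', reject);
  });
}
```

A call such as fetchText(url, 'euc-jp').then(console.log) would then print the page as an ordinary JavaScript string, where url stands for whatever page is being scraped. The key point is to leave the response as Buffers and convert only after all bytes have arrived.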


Source: https://habr.com/ru/post/1787383/

