How to translate a double-UTF-8 decoder from Python to Lua

I have this snippet of legacy code that (apparently) converts double-encoded UTF-8 text back to regular UTF-8:

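# Run with python3!
import codecs
import sys

# Read the file and decode the outer layer of UTF-8
s = codecs.open('doubleutf8.dat', 'r', 'utf-8').read()
# raw_unicode_escape turns each code point below 256 back into a single byte;
# those bytes are then decoded as UTF-8 a second time
sys.stdout.write(s.encode('raw_unicode_escape').decode('utf-8'))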

I need to translate it into Lua and reproduce any side effects of the decoding (if there are any).
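
To make the intended effect concrete, here is a minimal round trip in Python. The sample string is made up for illustration and is not taken from doubleutf8.dat; it also assumes the double encoding went through Latin-1, which is what the snippet above implies.

```python
# Hypothetical sample text, not the contents of doubleutf8.dat
original = "é"

# Double encoding: the UTF-8 bytes are misread as Latin-1, then encoded as UTF-8 again
double_encoded = original.encode("utf-8").decode("latin-1").encode("utf-8")

# The legacy snippet's two steps: decode the outer layer, then map each
# code point below 256 back to one byte and decode those bytes as UTF-8
outer = double_encoded.decode("utf-8")
restored = outer.encode("raw_unicode_escape").decode("utf-8")

print(double_encoded)  # b'\xc3\x83\xc2\xa9'
print(restored == original)  # True
```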

Constraints: I can use any available Lua module to handle UTF-8, preferably a stable one with LuaRocks support. I do not want to use Lupa or any other Lua-Python bridge, and I do not want to call Python through os.execute().

1 answer

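You can use lua-iconv, the Lua binding to the iconv library. It converts text between character encodings.

It is available through LuaRocks.

For example, the following Lua code decodes the data: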

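local iconv = require 'iconv'
-- convert from UTF-8 to Latin-1, which strips one layer of UTF-8 encoding
local decoder = iconv.new('latin1', 'utf8')
-- read the whole file as raw bytes and close it afterwards
local file = assert(io.open('doubleutf8.dat', 'rb'))
local data = file:read('*a')
file:close()
-- decodedData is encoded in UTF-8
local decodedData = decoder:iconv(data)
-- if your terminal understands UTF-8, this prints the decoded text
-- if not, you can further convert it from UTF-8 to any other encoding, such as KOI8-R
print(decodedData)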
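
One caveat about side effects, sketched in Python because that is where the original behavior is defined. A plain Latin-1 conversion and raw_unicode_escape agree only while every code point is below 256. The helper name undo_double_utf8 and the sample strings are made up for illustration, and I have not checked how lua-iconv itself reports input it cannot represent in Latin-1.

```python
def undo_double_utf8(text, codec):
    """Encode text with the given single-byte-style codec, then decode as UTF-8."""
    return text.encode(codec).decode("utf-8")

# Code points all below 256: both codecs give the same result
clean = "\u00c3\u00a9"  # what "é" looks like after one layer is decoded
same = undo_double_utf8(clean, "latin-1") == undo_double_utf8(clean, "raw_unicode_escape")

# A code point above 255: raw_unicode_escape emits a literal \uXXXX escape,
# while strict Latin-1 encoding refuses the input
mixed = "\u20ac"
escaped = undo_double_utf8(mixed, "raw_unicode_escape")
try:
    undo_double_utf8(mixed, "latin-1")
    latin1_failed = False
except UnicodeEncodeError:
    latin1_failed = True

print(same)           # True
print(repr(escaped))  # '\\u20ac'
print(latin1_failed)  # True
```

If the input file can contain such characters, the Lua port would need to decide whether to reproduce the escape output or treat it as an error.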

Source: https://habr.com/ru/post/1792860/

