Writing Unicode to a file using Python

My problem is that I can output Unicode characters to my terminal, but not to files. Demonstration:

user@ubuntu:~$ python -c 'print u"\u5000"'
倀
user@ubuntu:~$ python -c 'print u"\u5000"' >a.out
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u5000' in position 0: ordinal not in range(128)

The output of "locale":

LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
2 answers

Your terminal is configured for UTF-8, so when Python writes directly to it, Python picks up the terminal's encoding and knows how to encode the Unicode character. When stdout is redirected to a file, there is no terminal to take an encoding from, so Python 2 falls back to ASCII, which cannot represent U+5000. To write to a file, encode the string to bytes explicitly:

python -c 'print u"\u5000".encode("UTF-8")' >a.out
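
Inside a script, the same idea can be applied when opening the file instead of on every print. The following is only a minimal sketch: the helper name `write_text` is hypothetical, and it assumes `io.open`, which accepts an `encoding` argument on both Python 2.6+ and Python 3.

```python
import io
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    """Write unicode text to `path`, encoding it explicitly."""
    # The file object does the encoding itself, so the result does not
    # depend on the locale or on whether stdout is a terminal.
    with io.open(path, "w", encoding=encoding) as f:
        f.write(text)


if __name__ == "__main__":
    # Hypothetical target path, used only for this demonstration.
    path = os.path.join(tempfile.mkdtemp(), "a.out")
    write_text(path, u"\u5000\n")
    with open(path, "rb") as f:
        print(repr(f.read()))  # raw UTF-8 bytes as stored on disk
```

Writing through a file opened with an explicit encoding keeps the choice of encoding next to the file it applies to, rather than relying on how the script happens to be invoked.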

The problem was indeed on the Python side. The solution was to set the environment variable PYTHONIOENCODING=utf_8 before running the script.
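
To see the effect of the variable without touching the shell configuration, one option is to launch a child interpreter with PYTHONIOENCODING set and inspect the bytes it writes to a redirected stdout. This is a sketch under assumptions: the helper name `run_with_io_encoding` is hypothetical, and the child is started with whatever interpreter `sys.executable` points to.

```python
import os
import subprocess
import sys


def run_with_io_encoding(code, encoding="utf_8"):
    """Run `code` in a child interpreter and return its raw stdout bytes."""
    # Copy the current environment and add PYTHONIOENCODING for the child only.
    env = dict(os.environ, PYTHONIOENCODING=encoding)
    # stdout is a pipe here, i.e. not a terminal, like the ">a.out" redirect.
    proc = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        env=env,
    )
    out, _ = proc.communicate()
    return out


if __name__ == "__main__":
    # print(u"...") is valid on both Python 2 and Python 3.3+.
    raw = run_with_io_encoding('print(u"\\u5000")')
    print(repr(raw))
```

Because the variable is read at interpreter startup, it has to be present in the environment of the process that runs the script; setting it from inside an already running script does not change that script's own stdout.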


Source: https://habr.com/ru/post/1526154/