Output to the Windows console when raising an exception that contains a Unicode literal (u"\u0410")

I ran into an obscure issue when a raised Python exception is printed to the Windows console. If the exception message contains a Unicode literal, it is either not printed at all or printed incorrectly. The console encoding is CP866.

When Python's default encoding is ascii, I run:

raise LookupError(u"symbol: \u0411") 

The output is:

LookupError


When I set the default encoding to utf-8, I get:

LookupError: symbol: ╨С


When I run

 print u"symbol: \u0411" 

In both cases, I get:

symbol: Б

Why do the two cases behave so differently, and what is the right way to handle this?

1 answer

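When an exception with a Unicode message is printed, Python tries to encode the message using the encoding returned by sys.getdefaultencoding(). If that fails (as it does with ascii), the encoding error is suppressed and the message is dropped. If it succeeds with an encoding that differs from the console's (utf-8 on a CP866 console), the bytes are displayed as the wrong characters.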

With print, by contrast, a Unicode string is encoded using sys.stdout.encoding, which matches the console. Admittedly, it would be better if excepthook used sys.stderr.encoding rather than sys.getdefaultencoding().
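The three outputs from the question can be reproduced with explicit encode/decode calls. This is only a sketch of the mechanism described above, not the interpreter's actual code path; the helper name as_seen_on_console is hypothetical, and the console is assumed to decode bytes as CP866, as stated in the question.

```python
MESSAGE = u"symbol: \u0411"   # same text as in the question
CONSOLE = "cp866"             # console code page from the question


def as_seen_on_console(text, used_encoding, console_encoding=CONSOLE):
    """Encode text with used_encoding, then decode the resulting bytes
    the way the console would. Returns None if encoding fails
    (hypothetical helper, for illustration only)."""
    try:
        raw = text.encode(used_encoding)
    except UnicodeEncodeError:
        return None
    return raw.decode(console_encoding)


# Default encoding ascii: U+0411 cannot be encoded, so the message is lost.
ascii_case = as_seen_on_console(MESSAGE, "ascii")

# Default encoding utf-8: bytes D0 91 are shown as U+2568 and U+0421 in CP866.
utf8_case = as_seen_on_console(MESSAGE, "utf-8")

# print path: encoded with the console's own encoding, so it round-trips.
print_case = as_seen_on_console(MESSAGE, "cp866")

# Print with escapes so the demo itself cannot hit an encoding error.
for label, value in [("ascii", ascii_case), ("utf-8", utf8_case), ("cp866", print_case)]:
    shown = None if value is None else value.encode("unicode_escape").decode("ascii")
    print("%s -> %s" % (label, shown))
```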

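Note that the following works, because the message is then already a byte string and needs no further encoding (your_encoding should be the console encoding, cp866 here):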

 raise LookupError(u"symbol: \u0411".encode(your_encoding)) 
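To avoid hard-coding your_encoding, the message can be encoded for whatever stream will display it. The helper below is a minimal sketch under the assumption that the traceback goes to sys.stderr; encode_for_stream and _FakeConsole are hypothetical names introduced for illustration. The "replace" error handler is a design choice that substitutes "?" for unencodable characters instead of failing.

```python
import sys


def encode_for_stream(message, stream=None, fallback="ascii"):
    """Return message as bytes in the stream's encoding.

    stream.encoding can be None (for example when output is redirected),
    so fall back to a safe default; unencodable characters become '?'.
    """
    stream = sys.stderr if stream is None else stream
    encoding = getattr(stream, "encoding", None) or fallback
    return message.encode(encoding, "replace")


class _FakeConsole(object):
    """Stand-in for a CP866 console stream (illustration only)."""
    encoding = "cp866"


class _RedirectedStream(object):
    """Stand-in for a stream without a known encoding (illustration only)."""
    encoding = None


cp866_bytes = encode_for_stream(u"symbol: \u0411", _FakeConsole())
fallback_bytes = encode_for_stream(u"symbol: \u0411", _RedirectedStream())

# Python 2 usage, following the answer above:
#     raise LookupError(encode_for_stream(u"symbol: \u0411"))
```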

You can also change the default encoding in sitecustomize or usercustomize by calling sys.setdefaultencoding(your_encoding). Ideally, the system should be configured so that the default encoding matches sys.stderr.encoding (and the encoding of the other standard streams).
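A sitecustomize.py along these lines is one way to do that. It is a sketch for Python 2 only: sys.setdefaultencoding is removed from the sys module once site initialization finishes, which is why the call has to live in sitecustomize or usercustomize, and it does not exist at all in Python 3. The function name align_default_encoding is hypothetical.

```python
# sitecustomize.py (sketch, Python 2 only)
import sys


def align_default_encoding(stream=None):
    """Set the default encoding to the stream's encoding, if possible.

    Returns True when the default encoding was changed, False when the
    stream has no encoding or sys.setdefaultencoding is unavailable.
    """
    stream = sys.stderr if stream is None else stream
    encoding = getattr(stream, "encoding", None)
    setter = getattr(sys, "setdefaultencoding", None)
    if encoding and setter is not None:
        setter(encoding)
        return True
    return False


changed = align_default_encoding()
```

Changing the default encoding affects every implicit str/unicode conversion in the process, so this is a global setting rather than a local fix.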

Also, this problem does not exist in Python 3.


Source: https://habr.com/ru/post/1434944/

