Let's say
s = u"test\u0627\u0644\u0644\u0647 \u0623\u0643\u0628\u0631\u7206\u767A\u043E\u043B\u043E\u043B\u043E"
If I try to print it directly,
>>> print s
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'cp932' codec can't encode character u'\u0627' in position 4: illegal multibyte sequence
So I switch the console code page to UTF-8 from within Python (otherwise the console will not understand my input).
import win32console
win32console.SetConsoleOutputCP(65001)
win32console.SetConsoleCP(65001)
Then I print the string encoded as UTF-8 myself, because Python does not recognize code page 65001 (chcp 65001) as UTF-8 (a known bug).
>>> print s.encode('utf-8')
testالله أكبر爆発Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 0] Error
As you can see, it prints successfully until it hits a new line, then it throws an IOError.
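As a side note on the code page 65001 issue: one workaround I have seen is to register the name cp65001 as an alias for the UTF-8 codec, so that Python can resolve the console encoding. This is only a minimal sketch; the helper name cp65001_search is made up for illustration, and it addresses only the codec lookup, not the IOError shown above. Newer Python 3 releases appear to recognize this name on their own, in which case the registration is harmless.

```python
import codecs

def cp65001_search(name):
    """Hypothetical helper: resolve the name 'cp65001' to the UTF-8 codec."""
    if name.lower() == "cp65001":
        return codecs.lookup("utf-8")
    # Returning None lets the other registered search functions handle the name.
    return None

# Register the search function so codec lookups can consult it.
codecs.register(cp65001_search)
```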
The following workaround works:
def safe_print(str):
    try:
        print str.encode('utf-8')
    except:
        pass
    print

>>> safe_print(s)
testالله أكبر爆発
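A slightly narrower variant of the same workaround, as a sketch only: it catches just IOError instead of everything, avoids shadowing the built-in str, and takes an optional stream argument (my own addition, so it can be tried against something other than the console). I am assuming that writing to the stream fails the same way print does; I have not confirmed that.

```python
import sys

def safe_print(text, stream=None):
    """Write text as UTF-8 bytes, ignoring an IOError raised by the text write."""
    if stream is None:
        # Python 3 exposes the byte stream as sys.stdout.buffer;
        # Python 2 accepts encoded bytes on sys.stdout directly.
        stream = getattr(sys.stdout, "buffer", sys.stdout)
    try:
        stream.write(text.encode("utf-8"))
    except IOError:
        # Assumption: this is the spurious console error, so it is skipped.
        pass
    # Emit the newline separately, like the bare print in the workaround above.
    stream.write(b"\n")
    stream.flush()
```

Unlike the bare except, this still lets unrelated errors (for example a UnicodeEncodeError or KeyboardInterrupt) surface.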
But there must be a better way. Any suggestions?