Python 3 default encoding cp1252

Recently I ran into problems decoding a file handle (the errors mention bytes 0x81 and 0x8D) from the Biopython module, using Anaconda 4.1.1 with Python 3.5.2 on a Sony Vaio running Windows 10.

After some research, it seems the problem may be that the default codec used for decoding is cp1252. I ran the code below and found that the codec is indeed set to cp1252 by default.

    import locale
    os_encoding = locale.getpreferredencoding()

However, a few posts suggest that Python 3 should use UTF-8 as its default codec. Is that right? If so, why is mine cp1252, and how can I fix it?

1 answer

According to What's New in Python 3.0,

There is a platform-dependent default encoding [...] In many cases, but not all, the system default is UTF-8; you should never count on this default.

and

PEP 3120: The default source encoding is now UTF-8.

In other words, Python opens source files by default as UTF-8, but any interaction with the file system will depend on the environment. It is strongly recommended that you use open(filename, encoding='utf-8') to read the file.
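As a minimal sketch of that recommendation (the sample text and the temporary file are made up for illustration, they are not from the question), the following writes a small UTF-8 file and reads it back twice: once with an explicit encoding='utf-8', and once with encoding='cp1252' to imitate what a cp1252 default would do. The character U+00C1 is stored in UTF-8 as the bytes 0xC3 0x81, and 0x81 has no mapping in Python's cp1252 codec, which is one plausible way to end up with an error mentioning 0x81.

```python
import os
import tempfile

# Hypothetical sample text: U+00C1 becomes the bytes 0xC3 0x81 in UTF-8.
text = "\u00c1rbol"

# Write a small UTF-8 file to a temporary location.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w", encoding="utf-8") as handle:
    handle.write(text)

# Explicit encoding: the result no longer depends on the locale.
with open(path, encoding="utf-8") as handle:
    utf8_result = handle.read()

# Forcing cp1252 imitates a system whose default is cp1252:
# byte 0x81 is undefined in that code page, so decoding fails.
try:
    with open(path, encoding="cp1252") as handle:
        handle.read()
    cp1252_error = None
except UnicodeDecodeError as exc:
    cp1252_error = exc

os.remove(path)

print(ascii(utf8_result))
print(cp1252_error)
```

If the files really are UTF-8, passing the encoding explicitly wherever the file is opened avoids the error without touching any system setting.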

Another change is that b'bytes'.decode() and 'str'.encode() with no argument use utf-8 instead of ascii.
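A short illustration of that point, using a made-up sample string: calling encode() and decode() with no argument round-trips non-ASCII text through UTF-8, while asking for ASCII explicitly fails.

```python
# Hypothetical sample string containing one non-ASCII character (U+00E9).
text = "caf\u00e9"

# No argument: Python 3 uses UTF-8.
encoded = text.encode()
print(encoded)  # b'caf\xc3\xa9'

# No argument on decode() either: UTF-8 again.
decoded = encoded.decode()
roundtrip_ok = decoded == text

# Requesting ASCII explicitly cannot represent U+00E9.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(roundtrip_ok, ascii_failed)  # True True
```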

Python 3.6 changes a few more default values:

PEP 529: Change Windows File System Encoding to UTF-8

PEP 528: Change Windows Console Encoding to UTF-8

But the default encoding for open() is still whatever Python manages to infer from the environment.
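To see which of these defaults apply on a given machine, something like the sketch below can help. It only reports values and changes nothing. The dictionary keys are labels chosen here for readability, and the assumption that open() follows locale.getpreferredencoding(False) holds for the Python 3 versions discussed in this answer.

```python
import locale
import sys

# Collect the different "default encodings" Python 3 exposes.
defaults = {
    # What open() falls back to when no encoding= is given.
    "open() default (locale)": locale.getpreferredencoding(False),
    # Used to encode and decode file names.
    "filesystem": sys.getfilesystemencoding(),
    # Used by str.encode() / bytes.decode() with no argument.
    "str.encode()/bytes.decode()": sys.getdefaultencoding(),
    # Encoding of the standard output stream, if it has one.
    "stdout": getattr(sys.stdout, "encoding", None),
}

for name, value in defaults.items():
    print(name, "->", value)
```

On the system described in the question, the first entry would be expected to show cp1252 while the third shows utf-8, which is why the two kinds of "default" are easy to confuse.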

It looks like 3.7 will add an (opt-in!) mode in which the locale encoding from the environment is ignored and everything is UTF-8 all the time (except in specific cases where Windows uses UTF-16, I suppose); see PEP 540 and the corresponding issue 29240.
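For readers on Python 3.7 or later, where the PEP 540 mode is available, a rough way to try it is to start a child interpreter with -X utf8 and ask it what it reports. This is only a sketch under that version assumption; the probe string is an illustrative one-liner, not anything taken from the PEP.

```python
import subprocess
import sys

# One-liner run in the child: print the UTF-8 mode flag and the locale encoding.
probe = (
    "import locale, sys; "
    "print(sys.flags.utf8_mode, locale.getpreferredencoding(False))"
)

# Start a child interpreter with UTF-8 mode switched on via -X utf8
# (setting PYTHONUTF8=1 in the environment is the other way to opt in).
result = subprocess.run(
    [sys.executable, "-X", "utf8", "-c", probe],
    stdout=subprocess.PIPE,
    text=True,
    check=True,
)

flag, encoding = result.stdout.split()
print(flag, encoding)
```

With the mode enabled, the child should report the flag as 1 and a UTF-8 encoding regardless of the system locale.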


Source: https://habr.com/ru/post/1263880/
