A non-ASCII character in the user path prevents importing a module

When I try to import the nltk module, the following error message is shown:

      File "C:\Python27\lib\ntpath.py", line 310, in expanduser
        return userhome + path[i:]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xb3 in position 13: ordinal not in range(128)

My username contains the character ł (byte 0xb3), but what puzzles me is that other modules, such as re and codecs, import successfully.

Is it possible to solve this problem on the Python side (without changing the username on the system)?
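
The decoding failure in the traceback can be reproduced in isolation. The sketch below is only an illustration: the home directory and the user name in it are made up, the cp1250 code page is an assumption (it is one code page in which byte 0xb3 is ł), and `try_decode` is a hypothetical helper. It uses Python 3 syntax so that it runs on a current interpreter, with `bytes` standing in for the Python 2 `str` type.

```python
# Hypothetical home directory as a byte string, similar to what Python 2
# reads from os.environ on Windows. The user name is made up.
userhome = b"C:\\Users\\Rafa\xb3"


def try_decode(raw, encoding):
    """Return the decoded text, or the UnicodeDecodeError that was raised."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


# Python 2 decodes implicitly with ASCII when str and unicode are joined,
# which is the step that fails inside expanduser.
ascii_result = try_decode(userhome, "ascii")

# Assumption: the system code page is cp1250, where 0xb3 maps to U+0142.
cp1250_result = try_decode(userhome, "cp1250")

print(repr(ascii_result))
print(ascii(cp1250_result))
```

This would also explain why modules such as re and codecs import fine: the error only appears on a code path that expands `~` against the home directory with a unicode path.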
1 answer

The expanduser function in ntpath.py does not handle a username that contains non-ASCII characters, so you need to add the following lines to expanduser(path) in ntpath.py:

    if isinstance(path, unicode):
        userhome = unicode(userhome, 'unicode-escape').encode('utf8')

With that change, the expanduser function should look as follows:

    def expanduser(path):
        """Expand ~ and ~user constructs.

        If user or $HOME is unknown, do nothing."""
        if isinstance(path, bytes):
            tilde = b'~'
        else:
            tilde = '~'
        if not path.startswith(tilde):
            return path
        i, n = 1, len(path)
        while i < n and path[i] not in _get_bothseps(path):
            i += 1

        if 'HOME' in os.environ:
            userhome = os.environ['HOME']
        elif 'USERPROFILE' in os.environ:
            userhome = os.environ['USERPROFILE']
        elif not 'HOMEPATH' in os.environ:
            return path
        else:
            try:
                drive = os.environ['HOMEDRIVE']
            except KeyError:
                drive = ''
            userhome = join(drive, os.environ['HOMEPATH'])

        if isinstance(path, bytes):
            userhome = userhome.encode(sys.getfilesystemencoding())

        if isinstance(path, unicode):
            userhome = unicode(userhome, 'unicode-escape').encode('utf8')

        if i != 1:  # ~user
            userhome = join(dirname(userhome), path[1:i])

        return userhome + path[i:]
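
Because the 'unicode-escape' codec also interprets backslash sequences, which Windows paths are full of, an alternative worth trying is to convert userhome to the same string type as path using the file system encoding. The following is a minimal sketch of that idea, not the code from ntpath.py: `coerce_like` is a hypothetical helper, the example path and the cp1250 code page are assumptions, and it is written in Python 3 syntax (`bytes` and `str` in place of the Python 2 `str` and `unicode`) so that it can be run as is.

```python
import sys


def coerce_like(userhome, path, encoding=None):
    """Return userhome converted to the same string type as path.

    Hypothetical helper: encoding defaults to the file system encoding.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    if isinstance(path, bytes) and isinstance(userhome, str):
        # Byte path requested: encode the text home directory.
        return userhome.encode(encoding)
    if isinstance(path, str) and isinstance(userhome, bytes):
        # Text path requested: decode the raw bytes from the environment.
        return userhome.decode(encoding)
    return userhome


# Made-up home directory with byte 0xb3; cp1250 is assumed as the code page.
raw_home = b"C:\\Users\\Rafa\xb3"
path = "~\\nltk_data"

expanded = coerce_like(raw_home, path, "cp1250") + path[1:]
print(ascii(expanded))
```

In a Python 2 patch the same idea would come down to decoding userhome with sys.getfilesystemencoding() when path is unicode, so that the final concatenation joins two unicode objects and no implicit ASCII decoding takes place.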

Source: https://habr.com/ru/post/979044/

