Spanish text in .py files

This is the code

A = "Diga sí por cualquier número de otro cuidador.".encode("utf-8") 

I get this error:

'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)

I tried numerous encodings unsuccessfully.

Edit:

I already have this at the beginning of the file:

 # -*- coding: utf-8 -*- 

Changing the line to

 A = u"Diga sí por cualquier número de otro cuidador.".encode("utf-8") 

does not help.

+6
7 answers

Are you using Python 2?

In Python 2, this string literal is a byte string. You are trying to encode it, but only a Unicode string can be encoded, so Python first tries to decode the byte string into a Unicode string using the default ASCII encoding.

Unfortunately, your string contains non-ASCII characters, so it cannot be decoded to Unicode.
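The hidden decode step described above can be spelled out by hand. This is only a minimal sketch, written for Python 3 (where byte strings have no `.encode` method), so it imitates the Python 2 behaviour rather than reproducing it exactly. The function name `py2_style_encode` is hypothetical, and the sample bytes are assumed to be UTF-8.

```python
def py2_style_encode(raw: bytes, target: str = "utf-8") -> bytes:
    """Imitate Python 2's str.encode(): decode as ASCII first, then encode."""
    text = raw.decode("ascii")  # the implicit step; raises on any byte >= 0x80
    return text.encode(target)


# Pure ASCII bytes survive the round trip unchanged.
print(py2_style_encode(b"Diga"))

# Any non-ASCII byte makes the implicit decode fail before encoding starts.
raw = "Diga sí".encode("utf-8")  # assumption: the literal was stored as UTF-8
try:
    py2_style_encode(raw)
except UnicodeDecodeError as exc:
    print(exc.encoding, hex(raw[exc.start]), exc.start)
```

The exception comes from the decode, which is why the message mentions decoding even though the code only calls `.encode`.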

A better solution is to use a Unicode string literal, for example:

 A = u"Diga sí por cualquier número de otro cuidador.".encode("utf-8") 
+4

Error message: 'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)

says that the 7th byte is 0xed. This is either the first byte of the UTF-8 sequence for some (possibly CJK) high-ordinal Unicode character (which is completely inconsistent with the reported facts), or it is i-acute (í) encoded in Latin1 or cp1252. I bet on cp1252.

If your file were encoded in UTF-8, the offending byte would be 0xc3, not 0xed:

 Preliminaries:

 >>> import unicodedata
 >>> unicodedata.name(u'\xed')
 'LATIN SMALL LETTER I WITH ACUTE'
 >>> uc = u'Diga s\xed por'

 What happens if file is encoded in UTF-8:

 >>> infile = uc.encode('utf8')
 >>> infile
 'Diga s\xc3\xad por'
 >>> infile.encode('utf8')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)
 #### NOT the message reported in the question ####

 What happens if file is encoded in cp1252 or latin1 or similar:

 >>> infile = uc.encode('cp1252')
 >>> infile
 'Diga s\xed por'
 >>> infile.encode('utf8')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)
 #### As reported in the question ####

Having # -*- coding: utf-8 -*- at the beginning of your code does not magically guarantee that your file is encoded in UTF-8; that is up to you and your text editor.
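One way to check what the editor actually wrote is to test whether the file's bytes decode cleanly as UTF-8. The following is a rough sketch in Python 3. The helper name `is_valid_utf8` is hypothetical, and the two byte strings stand in for the same source text saved by two differently configured editors.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same source text, declared as UTF-8 in both cases.
source_text = "# -*- coding: utf-8 -*-\nA = 'sí'\n"
saved_as_utf8 = source_text.encode("utf-8")      # editor really used UTF-8
saved_as_cp1252 = source_text.encode("cp1252")   # editor used cp1252 instead

print(is_valid_utf8(saved_as_utf8))
print(is_valid_utf8(saved_as_cp1252))
```

For a real file, the bytes would come from reading it in binary mode, for example `open(path, "rb").read()`. A clean decode does not prove the file is UTF-8, but a failed one shows that the declaration and the actual encoding disagree.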

Actions:

  • Save the file as UTF-8.
  • As suggested by others, you need u'blah l
+3

Put this on the first line of your code:

 # -*- coding: utf-8 -*- 
+1

You must specify the encoding of the source file by adding the following line to the very beginning of your code (provided that your file is encoded in UTF-8):

 # Encoding: UTF-8 

Otherwise, Python 2 will assume ASCII encoding and fail during parsing.
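To see which encoding a declaration line actually selects, the standard library's `tokenize.detect_encoding` can be asked directly. This is a small sketch for Python 3. The wrapper name `declared_encoding` is hypothetical. Note that Python 3 falls back to UTF-8 when no declaration is present, whereas the ASCII default discussed here applies to Python 2.

```python
import io
import tokenize


def declared_encoding(first_lines: bytes) -> str:
    """Return the source encoding the tokenizer infers from the given bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(first_lines).readline)
    return encoding


# Different spellings of the declaration comment.
print(declared_encoding(b"# -*- coding: utf-8 -*-\n"))
print(declared_encoding(b"# Encoding: UTF-8\n"))
print(declared_encoding(b"# -*- coding: latin-1 -*-\n"))
```

The declaration is matched by a pattern that looks for `coding:` or `coding=` in a comment on the first or second line, which is why several spellings are accepted.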

0

You are probably working with a regular byte string rather than a unicode string:

 >> type(u"zażółć gęślą jaźń") -> <type 'unicode'>
 >> type("zażółć gęślą jaźń") -> <type 'str'>

So

 u"Diga sí por cualquier número de otro cuidador.".encode("utf-8") 

should work.

If you want to use non-ASCII characters in your string literals, put

 # -*- coding: utf-8 -*- 

in the first line of your script.

See also the docs.

PS. The text in the examples above is Polish :)

0

On the first or second line of your code, add this comment:

  # -*- coding: latin-1 -*- 

For a list of supported characters, see http://en.wikipedia.org/wiki/Latin-1_Supplement_%28Unicode_block%29

And for the languages covered: http://en.wikipedia.org/wiki/ISO_8859-1

0

Perhaps this is what you want to do:

 A = 'Diga sí por cualquier número de otro cuidador'.decode('latin-1') 

And don't forget to add # -*- coding: latin-1 -*- to the top of your code.
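The decode shown above can be followed by an encode when UTF-8 bytes are what is ultimately needed. The following is a minimal sketch in Python 3, assuming the input bytes really are Latin-1. The helper name `reencode` is hypothetical.

```python
def reencode(data: bytes, source: str = "latin-1", target: str = "utf-8") -> bytes:
    """Decode bytes from the source encoding and encode them in the target one."""
    return data.decode(source).encode(target)


# Assumption: these bytes came from a file saved as Latin-1 (0xed is í).
latin1_bytes = b"Diga s\xed por"

text = latin1_bytes.decode("latin-1")
print(text)
print(reencode(latin1_bytes))
```

Latin-1 maps every byte value to a character, so this decode never raises. If the bytes were actually in another encoding, the result would be silently wrong rather than an error.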

0

Source: https://habr.com/ru/post/889394/

