Spanish text in .py files

Question

This is the code

A = "Diga sí por cualquier número de otro cuidador.".encode("utf-8")

I get this error:

'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)

I tried numerous encodings unsuccessfully.

Edit:

I already have this at the beginning of the file:

 # -*- coding: utf-8 -*-

Changing it to

 A = u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")

does not help.

python character-encoding

Kamal Saini May 30, '11 at 17:00

7 answers

Mrab · Answer 1 · 2011-05-30T17:09:21+0000

Are you using Python 2?

In Python 2, that string literal is a byte string. You are trying to encode it, but only a Unicode string can be encoded, so Python first tries to decode the byte string to a Unicode string using the default 'ascii' encoding.

Unfortunately, your string contains non-ASCII characters, so it cannot be decoded to Unicode.

A better solution is to use a Unicode string literal, for example:

 A = u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")
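The hidden decode step described above can be spelled out by hand. This is only a minimal sketch, written for Python 3 (where the implicit ASCII decode no longer exists); the helper name `py2_style_encode` is hypothetical, and the cp1252 source bytes are an assumption made for illustration.

```python
def py2_style_encode(byte_string, target="utf-8"):
    """Mimic Python 2's str.encode(): decode as ASCII first, then encode."""
    text = byte_string.decode("ascii")  # the hidden step that fails on non-ASCII bytes
    return text.encode(target)


# Bytes as they might sit in a source file saved as cp1252 (assumption for this sketch).
raw = "Diga sí por cualquier número de otro cuidador.".encode("cp1252")

try:
    py2_style_encode(raw)
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Starting from text (the equivalent of a u"..." literal) skips the ASCII decode entirely.
encoded = "Diga sí por cualquier número de otro cuidador.".encode("utf-8")
print(encoded[:8])
```

The failure comes from the decode, not the encode, which is why the message names the 'ascii' codec even though the code asked for UTF-8.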

John machin · Answer 2 · 2011-05-30T21:57:37+0000

Error message: 'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)

says that the 7th byte is 0xed. This is either the first byte of the UTF-8 sequence for some (possibly CJK) high-ordinal Unicode character (which is not at all consistent with the reported facts), or it is your i-acute encoded in Latin1 or cp1252. I'm betting on cp1252.

If your file were encoded in UTF-8, the offending byte would be 0xc3, not 0xed:

 Preliminaries:

 >>> import unicodedata
 >>> unicodedata.name(u'\xed')
 'LATIN SMALL LETTER I WITH ACUTE'
 >>> uc = u'Diga s\xed por'

 What happens if file is encoded in UTF-8:

 >>> infile = uc.encode('utf8')
 >>> infile
 'Diga s\xc3\xad por'
 >>> infile.encode('utf8')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)
 #### NOT the message reported in the question ####

 What happens if file is encoded in cp1252 or latin1 or similar:

 >>> infile = uc.encode('cp1252')
 >>> infile
 'Diga s\xed por'
 >>> infile.encode('utf8')
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 6: ordinal not in range(128)
 #### As reported in the question ####
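The same diagnosis can be reproduced by looking for the first byte outside the ASCII range under each candidate encoding. The sketch below is written for Python 3, where iterating over bytes yields integers; `first_non_ascii` is a hypothetical helper, not part of any library.

```python
def first_non_ascii(data):
    """Return (index, byte value) of the first byte >= 0x80, or None if all ASCII."""
    for index, value in enumerate(data):
        if value >= 0x80:
            return index, value
    return None


text = "Diga sí por"
found = {}
for codec in ("utf-8", "cp1252", "latin-1"):
    # Encode the same text three ways and record where ASCII decoding would break.
    found[codec] = first_non_ascii(text.encode(codec))
    index, value = found[codec]
    print(codec, index, hex(value))
```

Only the single-byte encodings put 0xed at index 6, which matches the position and byte reported in the question.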

Having # -*- coding: utf-8 -*- at the beginning of your code does not magically guarantee that your file is encoded in UTF-8; that is up to you and your text editor.
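One way to check what the editor actually wrote is to read the file's raw bytes and try to decode them as UTF-8. This is a rough sketch for Python 3; `is_valid_utf8` is a hypothetical helper, and the temporary files only stand in for a real script. A clean decode is a strong hint rather than proof, since some non-UTF-8 files happen to decode without error.

```python
import os
import tempfile


def is_valid_utf8(path):
    """Return True if the file's raw bytes decode cleanly as UTF-8."""
    with open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


source = '# -*- coding: utf-8 -*-\nA = u"Diga sí"\n'
results = {}
for codec in ("utf-8", "cp1252"):
    # Write the same source text with two different on-disk encodings.
    with tempfile.NamedTemporaryFile(suffix=".py", delete=False) as tmp:
        tmp.write(source.encode(codec))
    results[codec] = is_valid_utf8(tmp.name)
    os.remove(tmp.name)

print(results)
```

Both files carry the same coding declaration, yet only one of them is really UTF-8 on disk.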

Actions:

Save the file as UTF-8.
As suggested by others, you need u'blah l

Ezequiel bertti · Answer 3 · 2011-05-30T17:07:35+0000

Put this on the first line of your code:

 # -*- coding: utf-8 -*-

Eser Aygün · Answer 4 · 2011-05-30T17:07:51+0000

You must specify the encoding of the source file by adding the following line to the very beginning of your code (provided that your file is encoded in UTF-8):

 # encoding: utf-8

Otherwise, Python will assume ASCII encoding and fail during parsing.
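To see how such a declaration is read, Python 3's standard `tokenize.detect_encoding` can be applied to a few source headers. This is only an illustrative sketch: `declared_encoding` is a hypothetical wrapper, and it runs on Python 3, which falls back to UTF-8 when no declaration is present, whereas the Python 2 interpreter discussed in this thread falls back to ASCII.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python 3's tokenizer detects for the given source bytes."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


emacs_style = declared_encoding(b"# -*- coding: utf-8 -*-\nA = 1\n")
plain_style = declared_encoding(b"# encoding: utf-8\nA = 1\n")
latin_style = declared_encoding(b"# -*- coding: latin-1 -*-\nA = 1\n")

print(emacs_style, plain_style, latin_style)
```

Both spellings of the comment are recognized, because the tokenizer only looks for `coding:` or `coding=` followed by a codec name in a comment on the first or second line.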

Xaerxess · Answer 5 · 2011-05-30T17:10:52+0000

You are probably working with a regular byte string, not a unicode string:

 >>> type(u"zażółć gęślą jaźń")
 <type 'unicode'>
 >>> type("zażółć gęślą jaźń")
 <type 'str'>

So

 u"Diga sí por cualquier número de otro cuidador.".encode("utf-8")

should work.

For Python to accept non-ASCII characters in your unicode literals, put

 # -*- coding: utf-8 -*-

in the first line of your script.
