Why does Python print string and unicode with the same value in different ways?

I am using Python 2.6.5, and when I run the following in the Python shell, I get:

>>> print u'Andr\xc3\xa9'
AndrÃ©
>>> print 'Andr\xc3\xa9'
André
>>>

What is the explanation of the above? Given u'Andr\xc3\xa9', how can I correctly display this value on an HTML page so that it shows André instead of AndrÃ©?

+3
3 answers

'\xc3\xa9' is the UTF-8 encoding of the unicode character u'\u00e9' (which can also be written as u'\xe9'). So you can use u'Andr\u00e9' or u'Andr\xe9'.

You can convert from one to the other:

>>> 'Andr\xc3\xa9'.decode('utf-8')
u'Andr\xe9'
>>> u'Andr\xe9'.encode('utf-8')
'Andr\xc3\xa9'
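
To address the literal from the question directly: u'Andr\xc3\xa9' is a unicode string whose last two characters are U+00C3 and U+00A9, that is, the two UTF-8 bytes each read as a separate character. Below is a minimal sketch of undoing that, assuming the text really is UTF-8 that was misread as Latin-1. The helper name fix_mojibake is hypothetical, and the b''/u'' prefixes are used only so the snippet runs on Python 2.6+ as well as Python 3.3+.

```python
def fix_mojibake(text):
    """Re-interpret a unicode string whose code points are really UTF-8 bytes.

    Assumes every character is below U+0100, so Latin-1 maps each one
    back to the single byte it came from.
    """
    raw = text.encode('latin-1')   # u'\xc3\xa9' -> b'\xc3\xa9'
    return raw.decode('utf-8')     # b'\xc3\xa9' -> u'\xe9'


broken = u'Andr\xc3\xa9'           # 6 characters: A n d r U+00C3 U+00A9
fixed = fix_mojibake(broken)       # 5 characters, ending in U+00E9
```

If the input contains characters above U+00FF, or bytes that are not valid UTF-8, the encode or decode step raises an error instead of guessing.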

Note that print 'Andr\xc3\xa9' gives the expected result only because your terminal uses UTF-8. For example, on Windows I get:

>>> print 'Andr\xc3\xa9'
AndrÃ©
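
The effect of the terminal's encoding can be simulated without changing terminals by decoding the same bytes with different codecs. This is only a sketch: the codec names below are examples chosen for illustration, not a statement about which code page any particular console uses.

```python
raw = b'Andr\xc3\xa9'  # the byte string from the question

# Decode the same six bytes the way differently configured terminals would.
as_utf8 = raw.decode('utf-8')      # UTF-8 terminal: two bytes become one character
as_cp1252 = raw.decode('cp1252')   # Western European Windows code page
as_latin1 = raw.decode('latin-1')  # ISO-8859-1

# as_utf8 ends in U+00E9; the other two end in U+00C3 U+00A9.
same_mojibake = (as_cp1252 == as_latin1 == u'Andr\xc3\xa9')
```

The byte string itself never changes. Only the interpretation applied at display time does, which is why a byte string prints correctly on one machine and as mojibake on another.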

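As for outputting HTML, it depends on the web framework you use and on the encoding you declare for the HTML page. Some frameworks (for example, Django) convert unicode values to the correct encoding automatically, while others require you to encode them yourself.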
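
When the encoding has to be done by hand, two common options are to encode to the charset the page declares, or to emit ASCII only with numeric character references. The following is a minimal sketch, assuming the page is served as UTF-8. The page template is made up for illustration, and escaping of characters such as & and < is a separate concern not shown here.

```python
name = u'Andr\xe9'

# Option 1: build the page as unicode, then encode once to the declared charset.
page = u'<html><head><meta charset="utf-8"></head><body>%s</body></html>' % name
body_bytes = page.encode('utf-8')

# Option 2: stay ASCII-only; non-ASCII characters become &#NNN; references,
# which browsers render correctly whatever charset the page declares.
ascii_bytes = name.encode('ascii', 'xmlcharrefreplace')
```

Either way, the value should be kept as unicode inside the program and encoded only at the point where bytes are written to the response.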

+11

Decode the byte string to unicode first:

>>> unicode('Andr\xc3\xa9', 'utf-8')
u'Andr\xe9'
>>> print u'Andr\xe9'
André

+1

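The two literals are different types, so they do not hold the same value. The second one is a UTF-8 byte string, not unicode.

To output HTML, encode the text in the encoding of the HTML page. For that, you can use the Python codecs module.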
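
One way to use the codecs module for this is to wrap the output stream in an encoding writer, so that unicode strings are encoded as they are written. In this sketch an in-memory io.BytesIO stands in for the real file or response stream, which is an assumption made for illustration.

```python
import codecs
import io

buf = io.BytesIO()                        # stands in for a file or socket
writer = codecs.getwriter('utf-8')(buf)   # encodes unicode to UTF-8 on write
writer.write(u'<p>Andr\xe9</p>')

encoded = buf.getvalue()                  # the bytes that would be sent
```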

-2

Source: https://habr.com/ru/post/1744363/

