Python string to unicode

Possible duplicate:
How to handle ASCII string as unicode and unescape escaped characters in it in python?
How to convert Unicode escape sequences to Unicode characters in a python string

I have a string that contains Unicode characters written as escapes, for example \u2026. Somehow it reached me as a str rather than as unicode. How do I convert it back to unicode?

 >>> a="Hello\u2026"
 >>> b=u"Hello\u2026"
 >>> print a
 Hello\u2026
 >>> print b
 Hello…
 >>> print unicode(a)
 Hello\u2026

So unicode(a) is not the answer. What is?

+47
python string unicode python-unicode
Apr 22 2018-12-22T00:
3 answers

Unicode escapes only work in unicode strings, so this

  a="\u2026"

is actually a string of 6 characters: '\', 'u', '2', '0', '2', '6'.
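One way to check this is to compare lengths. The sketch below is an illustration, not part of the original answer: it uses a bytes literal (`b"..."`), which is a plain str on Python 2, so that it behaves the same on Python 2 and Python 3. The names `raw` and `uni` are made up for the example.

```python
# Byte string: "\u" is not an escape here, so the backslash is kept literally.
# The backslash is doubled only to avoid an invalid-escape warning on Python 3.
raw = b"\\u2026"

# Unicode string: "\u2026" is a single code point (HORIZONTAL ELLIPSIS).
uni = u"\u2026"

print(len(raw))  # 6
print(len(uni))  # 1
```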

To turn such a string into unicode, use decode('unicode-escape'):

 a="\u2026"
 print repr(a)
 print repr(a.decode('unicode-escape'))
 ## '\\u2026'
 ## u'\u2026'
+68
Apr 22 '12 at 13:59

Decode it with the unicode-escape codec:

 >>> a="Hello\u2026"
 >>> a.decode('unicode-escape')
 u'Hello\u2026'
 >>> print _
 Hello…

This is because in a non-unicode string \u2026 is not recognized as an escape; it is treated as a literal series of characters (to make it clearer, 'Hello\\u2026'). You need to decode the escapes, and the unicode-escape codec can do that for you.

Note that you can get unicode() to do the same thing by passing the codec as an argument:

 >>> unicode(a, 'unicode-escape') u'Hello\u2026' 

But the a.decode() method is nicer.
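As a rough sketch, the decoding step can be wrapped in a small helper. The name `unescape` is hypothetical and not from the answers here. It assumes the input is a byte string (a plain str on Python 2, bytes on Python 3), so the same code runs on both. Keep in mind that the unicode-escape codec treats any non-escaped bytes as Latin-1, so this is only safe when everything outside the escapes is ASCII.

```python
def unescape(raw):
    """Hypothetical helper: decode literal \\uXXXX sequences in a byte string."""
    # 'unicode_escape' is the same codec as 'unicode-escape'.
    return raw.decode("unicode_escape")


text = unescape(b"Hello\\u2026")

print(text == u"Hello\u2026")  # True
print(len(text))               # 6: five letters plus one ellipsis character
```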

+23
Apr 22 '12 at 13:59
 >>> a="Hello\u2026"
 >>> print a.decode('unicode-escape')
 Hello…
+16
Apr 22 '12 at 14:00


