Python - How can I do a string find on a Unicode character that is held in a variable?

Question

This works:

 s = 'jiā'
 s.find(u'\u0101')

How do I do something like this:

 s = 'jiā'
 zzz = '\u0101'
 s.find(zzz)

Since I am using a variable now, how can I indicate that the string held in the variable is Unicode?

+6

python unicode

Steve Nov 11 '11 at 16:39

3 answers

zzz, as defined in your post, is a plain str object, not a unicode object, so there is no way to mark it as something it is not. You can, however, convert the str object to a unicode object by specifying the encoding:

 s.find(zzz.decode("utf-8"))

Replace utf-8 with whatever encoding your string is actually encoded in.
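A minimal sketch of the decode step, written so that it should run on both Python 2 and Python 3. The byte value b'\xc4\x81' (U+0101 encoded as UTF-8) and the names zzz_bytes, needle and wrong are assumptions made for illustration, not part of the original post.

```python
s = u'ji\u0101'            # the text to search, as a Unicode string
zzz_bytes = b'\xc4\x81'    # assumed input: U+0101 encoded as UTF-8

# Decoding with the matching encoding yields the single character U+0101.
needle = zzz_bytes.decode('utf-8')
print(s.find(needle))      # 2

# Decoding with the wrong encoding yields different characters, so no match.
wrong = zzz_bytes.decode('latin1')
print(s.find(wrong))       # -1
```

The second lookup shows why the encoding has to match the one the bytes were really written in: the call still succeeds, it just searches for the wrong text.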

Please note that in your example,

 zzz = '\u0101'

zzz is a plain string of length 6, because a non-Unicode literal does not interpret the \u escape. There is no easy way to fix this wrong string literal after the fact, except for hacks along the lines of:

 import ast
 ast.literal_eval("u'" + zzz + "'")
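To make the "length 6" point concrete, here is a small sketch. It uses a raw literal (r'\u0101') as a stand-in for the Python 2 byte-string literal, which is an assumption made so the example behaves the same on Python 2 and Python 3; the name fixed is also illustrative.

```python
import ast

# A raw literal keeps the backslash escape as six separate characters,
# which is what a plain (non-u) literal does in Python 2.
zzz = r'\u0101'
print(len(zzz))            # 6

# Re-parse the text as a Unicode literal so the escape is interpreted.
fixed = ast.literal_eval("u'" + zzz + "'")
print(len(fixed))          # 1

s = u'ji\u0101'
print(s.find(zzz))         # -1: the six-character text is not in s
print(s.find(fixed))       # 2
```

This only works when zzz contains nothing that would break the surrounding quotes, which is one reason it is a hack rather than a general fix.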

+2

Sven Marnach Nov 11 '11 at 16:43

In some cases (I don't know exactly when), you also have to decode the string you are searching in:

 s.decode("utf-8").find(u"\u0101")
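One case where this is probably needed is when the string being searched is itself a byte string. A rough sketch, assuming UTF-8 input; the helper name to_text is hypothetical and not from any library:

```python
def to_text(value, encoding='utf-8'):
    """Return value as text, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

haystack = u'ji\u0101'.encode('utf-8')   # assumed input: a UTF-8 byte string
needle = u'\u0101'

print(len(haystack))                     # 4 bytes
print(len(to_text(haystack)))            # 3 characters
print(to_text(haystack).find(to_text(needle)))  # 2
```

Decoding both sides before searching means the returned index counts characters rather than bytes.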

0

Cesc Mar 18 '14 at 8:11

kindall · Accepted Answer · 2011-11-11T16:47:17+0000

Since I am using a variable now, how can I indicate that the string held in the variable is Unicode?

By defining it as a Unicode string in the first place.

 zzz = u"foo"

Or, if you already have a string in some other encoding, by converting it to Unicode (the source encoding must be specified if the string is not ASCII).

 zzz = unicode(zzz, encoding="latin1")

Or by using Python 3, where all strings are Unicode.
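For comparison, a short sketch of the Python 3 behaviour. It is Python 3 only, and the byte value b'\xe9' is an assumed Latin-1 input chosen for illustration:

```python
# Python 3 only: every str literal is already Unicode,
# so the \u escape is interpreted without a u prefix.
s = 'ji\u0101'
zzz = '\u0101'
print(len(zzz))      # 1
print(s.find(zzz))   # 2

# Bytes coming from elsewhere still need an explicit encoding.
# b'\xe9' is an assumed Latin-1 input (U+00E9).
text = str(b'\xe9', encoding='latin1')
print(text == '\u00e9')  # True
```

Here str(..., encoding=...) plays the role that unicode(..., encoding=...) plays in Python 2.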
