Percent Encoding UTF-8 and Python

Question

I am trying to get Python to give me percent-encoded strings. The API I'm interacting with (which I think uses percent-encoded UTF-8) gives %c3%ae for î. However, Python's urllib.quote gives %3F.

    import urllib

    mystring = "î"
    print urllib.quote(mystring)
    print urllib.quote_plus(mystring)
    print urllib.quote(mystring.encode('utf-8'))

Any help would be appreciated.

+4

python utf-8 url-encoding

user1379351 Aug 10 '13 at 14:46

2 answers

This is because you are not declaring the encoding your file uses, so Python infers it from the current locale configuration. I suggest you do this:

    # -*- coding: utf-8 -*-
    import urllib

    mystring = "î"
    print urllib.quote(mystring)
    print urllib.quote_plus(mystring)

And make sure your file.py is actually saved to disk encoded as UTF-8.

For me it gives:

    $ python ex.py
    %C3%AE
    %C3%AE
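As an aside on the %3F from the question: %3F is the percent-encoding of a literal question mark, which suggests the î was replaced by "?" somewhere before quoting. The following is only a sketch of that effect, written for Python 3 (where the function lives in urllib.parse), and it assumes the loss came from a conversion to an encoding that cannot represent î; the actual cause on the asker's machine may differ.

```python
from urllib.parse import quote

# Hypothetical lossy step: ASCII cannot represent î, so with
# errors="replace" the character is substituted by "?".
lossy = "î".encode("ascii", errors="replace")

# "?" is not in quote()'s default safe set, so it gets percent-encoded.
quoted_lossy = quote(lossy)
print(quoted_lossy)  # %3F
```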

A few caveats. If you are trying this from the interactive interpreter, # -*- coding: utf-8 -*- will not work if your console encoding is not utf-8. Instead, you should change it to whatever encoding your console uses: # -*- coding: (encoding here) -*-.

Then you should decode your string to Unicode using the decode method, passing it the name of the encoding used by your console as an argument:

 mystring = "î".decode('<your encoding>')

Then pass it to urllib encoded as utf-8:

    print urllib.quote(mystring.encode('utf-8'))
    print urllib.quote_plus(mystring.encode('utf-8'))
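The decode-then-encode steps above can be sketched end to end as follows. This is a Python 3 version (urllib.parse.quote instead of urllib.quote), and it assumes a Latin-1 console purely for illustration; console_encoding and raw are hypothetical names standing in for whatever your terminal actually sends.

```python
from urllib.parse import quote, quote_plus

console_encoding = "latin-1"  # assumption: substitute your console's encoding
raw = b"\xee"                 # the single byte a Latin-1 console sends for î

# Step 1: decode the console bytes to a Unicode string.
text = raw.decode(console_encoding)

# Step 2: re-encode as UTF-8, which gives the two bytes C3 AE.
utf8_bytes = text.encode("utf-8")

# Step 3: percent-encode the UTF-8 bytes.
quoted = quote(utf8_bytes)
quoted_plus = quote_plus(utf8_bytes)
print(quoted)       # %C3%AE
print(quoted_plus)  # %C3%AE
```

Quoting raw directly, without the decode and re-encode, would produce %EE instead, which is not what a UTF-8 based API expects.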

Hope this helps!

+1

Paulo bu Aug 10 '13 at 14:59

Viktor Kerkez · Accepted Answer · 2013-08-10T14:53:49+0000

You should encode your string as utf-8 before quoting it, and the string should be a unicode object. You must also declare the encoding of the source file in the coding line at the top:

    # -*- coding: utf-8 -*-
    import urllib

    s = u'î'
    print urllib.quote(s.encode('utf-8'))

This gives me the output:

 %C3%AE
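For readers on Python 3, a rough equivalent is sketched below. It is not part of the original answer: quote moved to urllib.parse, string literals are already Unicode, and quote encodes str input as UTF-8 by default, so the explicit encode step becomes optional.

```python
from urllib.parse import quote

s = "î"  # Python 3 str is Unicode; source files default to UTF-8

# quote() encodes str as UTF-8 by default before percent-encoding.
from_str = quote(s)

# Passing UTF-8 bytes explicitly gives the same result.
from_bytes = quote(s.encode("utf-8"))

print(from_str)    # %C3%AE
print(from_bytes)  # %C3%AE
```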
