Percent Encoding UTF-8 and Python

I am trying to get Python to give me percent-encoded strings. The API I'm interacting with (which I think uses percent-encoded UTF-8) gives %c3%ae for î. However, Python's urllib.quote gives %3F.

    import urllib
    mystring = "î"
    print urllib.quote(mystring)
    print urllib.quote_plus(mystring)
    print urllib.quote(mystring.encode('utf-8'))

Any help is appreciated.

2 answers

You should encode your string as UTF-8 before quoting it, and the string should be a unicode object. You must also declare the encoding of the source file in a coding comment at the top of the file:

    # -*- coding: utf-8 -*-
    import urllib
    s = u'î'
    print urllib.quote(s.encode('utf-8'))

This gives me the output:

 %C3%AE 
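To see where both results come from, here is a minimal sketch. î is U+00EE, its UTF-8 encoding is the two bytes 0xC3 0xAE, and percent-encoding writes each byte as %XX. The %3F from the question is the percent-encoding of "?", which is what you get when the character has already been replaced by a question mark before quoting. The ASCII encode with "replace" below is only one way to reproduce that and is an assumption about what happened in the question. The helper name percent_encode_utf8 is made up for illustration, and the try/except import is an addition so the snippet also runs on Python 3, where quote lives in urllib.parse.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def percent_encode_utf8(text):
    """Percent-encode every UTF-8 byte of a unicode string by hand."""
    utf8_bytes = text.encode("utf-8")
    # bytearray yields integers on both Python 2 and Python 3
    return "".join("%%%02X" % b for b in bytearray(utf8_bytes))


s = u"\u00ee"  # the character î, written as an escape

# Manual encoding and urllib's quote agree on the UTF-8 bytes
print(percent_encode_utf8(s))
print(quote(s.encode("utf-8")))

# If î is turned into "?" first, quoting yields the percent-encoding of "?"
print(quote(s.encode("ascii", "replace")))
```

Note that the hand-written helper escapes every byte, including plain ASCII letters, while quote leaves unreserved characters untouched. The two only agree for input such as î where every byte needs escaping.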

This is because you are not declaring the encoding your file uses, so Python infers it from the current locale configuration. I suggest you do this:

    # -*- coding: utf-8 -*-
    import urllib
    mystring = "î"
    print urllib.quote(mystring)
    print urllib.quote_plus(mystring)

And make sure your file.py is saved to disk encoded as UTF-8.

For me it gives:

    $ python ex.py
    %C3%AE
    %C3%AE
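If you are unsure whether your editor really saved the file as UTF-8, a rough check is to read the raw bytes and try to decode them. This is only a sketch: the function name is_utf8_file is made up for illustration, and a file containing nothing but ASCII will also pass, since ASCII is valid UTF-8.

```python
import io


def is_utf8_file(path):
    """Return True if the raw bytes of the file decode cleanly as UTF-8."""
    with io.open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, is_utf8_file("ex.py") should return True for the script above if it was saved correctly. A file in which î was written as the single Latin-1 byte 0xEE would return False.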

A few caveats. If you are trying to do this from the interpreter, # -*- coding: utf-8 -*- will not work if your console encoding is not UTF-8. Instead, you should change it to whatever encoding your console uses: # -*- coding: (encoding here) -*-.

Then you should decode your string to Unicode using the decode method, passing it the name of the encoding used by your console as the argument:

 mystring = "î".decode('<your encoding>') 

Then pass it to urllib encoded as UTF-8:

    print urllib.quote(mystring.encode('utf-8'))
    print urllib.quote_plus(mystring.encode('utf-8'))
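The decode-then-encode steps above can be sketched as a single helper. The name quote_console_bytes is made up for illustration. Falling back to sys.stdin.encoding (and then to UTF-8) when no encoding is given is an assumption about how you might guess the console encoding, not something the steps above prescribe. The try/except import is an addition so the snippet also runs on Python 3.

```python
import sys

try:
    from urllib.parse import quote, quote_plus  # Python 3
except ImportError:
    from urllib import quote, quote_plus  # Python 2


def quote_console_bytes(raw, console_encoding=None):
    """Decode console bytes to unicode, then quote them as UTF-8."""
    if console_encoding is None:
        # Assumption: stdin reports the console encoding; else guess UTF-8
        console_encoding = getattr(sys.stdin, "encoding", None) or "utf-8"
    text = raw.decode(console_encoding)
    return quote(text.encode("utf-8"))


# On a Latin-1 console, î arrives as the single byte 0xEE
print(quote_console_bytes(b"\xee", "latin-1"))

# quote_plus differs from quote only in how it treats spaces
phrase = u"\u00ee le".encode("utf-8")
print(quote(phrase))
print(quote_plus(phrase))
```

The point of going through unicode is that the bytes you type depend on the console, while the API expects the UTF-8 bytes. Decoding with the console encoding first makes the result independent of where the string was typed.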

Hope this helps!


Source: https://habr.com/ru/post/1496319/

