Build a query string using python urlencode

I am trying to build a URL so that I can send a GET request to it using the urllib module.

Suppose my final_url should be

 url = "www.example.com/find.php?data=http%3A%2F%2Fwww.stackoverflow.com&search=Generate+value" 

Now for this I tried as follows:

    >>> initial_url = "http://www.stackoverflow.com"
    >>> search = "Generate+value"
    >>> params = {"data":initial_url,"search":search}
    >>> query_string = urllib.urlencode(params)
    >>> query_string
    'search=Generate%2Bvalue&data=http%3A%2F%2Fwww.stackoverflow.com'

Now, if you compare my query_string with the format of final_url, you can observe two things:

1) The order of the parameters is reversed: instead of data=()&search= it is search=()&data=

2) urlencode also encoded the + in Generate+value

I believe the first change is due to the arbitrary ordering of dictionaries. So I thought of using OrderedDict to keep the parameters in order. Since I am using Python 2.6.5, I did

 pip install ordereddict 

But I am unable to use it in my code when I try

    >>> od = OrderedDict((('a', 'first'), ('b', 'second')))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    NameError: name 'OrderedDict' is not defined

So my question is how to use OrderedDict in Python 2.6.5, and how to make urlencode ignore the + in Generate+value.

Also, is this the right approach to building a URL?

+6
3 answers

You should not worry about the + being encoded: it will be restored on the server when the URL is unescaped. The order of the named parameters should not matter either.

As for OrderedDict, it is not a Python built-in. You have to import it from collections:

    from urllib import urlencode, quote
    # from urllib.parse import urlencode, quote  # python3
    from collections import OrderedDict  # available from Python 2.7

    initial_url = "http://www.stackoverflow.com"
    search = "Generate+value"
    # Pass a list of pairs: keyword arguments lose their order before Python 3.6.
    query_string = urlencode(OrderedDict([("data", initial_url), ("search", search)]))
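Since the question mentions Python 2.6.5 and the ordereddict package from pip, a fallback import along the following lines may help. This is only a sketch: it assumes the backport exposes OrderedDict from a module named ordereddict, and it uses the Python 3 location of urlencode so that it runs on current interpreters (on Python 2, urlencode lives in urllib).

```python
try:
    from collections import OrderedDict  # Python 2.7 and later
except ImportError:
    # Assumption: the backport installed by "pip install ordereddict"
    from ordereddict import OrderedDict

from urllib.parse import urlencode  # Python 2: from urllib import urlencode

# A list of pairs keeps the insertion order on every Python version.
params = OrderedDict([
    ("data", "http://www.stackoverflow.com"),
    ("search", "Generate+value"),
])
query_string = urlencode(params)
print(query_string)
```

The NameError in the question most likely comes from the missing import rather than from the installation itself.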

If your Python is too old and does not have OrderedDict in the collections module, use:

    parameters = {"data": initial_url, "search": search}
    encoded = "&".join(
        "%s=%s" % (key, quote(parameters[key], safe="+"))
        for key in sorted(parameters.keys()))  # sorted() fixes the key order (alphabetical)

In any case, the order of the parameters should not matter.

Note the safe parameter of quote. It prevents + from being escaped, but that means the server will interpret Generate+value as Generate value. You can escape the + manually by writing %2B and marking % as a safe character.
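The effect of the safe parameter can be checked with a short sketch like the one below. It uses the Python 3 import path (on Python 2 the same function is urllib.quote); the variable names are only illustrative.

```python
from urllib.parse import quote  # Python 2: from urllib import quote

value = "Generate+value"

default_quoted = quote(value)           # + is escaped as %2B
plus_kept = quote(value, safe="+")      # + is left untouched

# A value where the + was escaped by hand beforehand.
pre_escaped = "Generate%2Bvalue"
percent_kept = quote(pre_escaped, safe="%")   # % is left alone, so %2B survives
double_escaped = quote(pre_escaped)           # % itself becomes %25

print(default_quoted)   # Generate%2Bvalue
print(plus_kept)        # Generate+value
print(percent_kept)     # Generate%2Bvalue
print(double_escaped)   # Generate%252Bvalue
```

Marking % as safe only makes sense when every other special character in the value has already been escaped by hand.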

+15

Firstly, the order of the parameters in an HTTP request should be completely irrelevant. If it is not, then the parsing library on the server side is doing something wrong.

Secondly, of course the + is encoded. + is used as a placeholder for a space in an encoded URL, so if your raw string contains a +, it has to be escaped. urlencode expects an unencoded string; you cannot pass it a string that is already encoded.
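A small round trip may make this concrete. The sketch below uses Python 3's urllib.parse (in Python 2, urlencode is in urllib and parse_qs is in urlparse) and plays the role of the server with parse_qs, which is an assumption about how the receiving side decodes the query.

```python
from urllib.parse import urlencode, parse_qs

# Raw, unencoded value containing a literal plus sign.
encoded = urlencode({"search": "Generate+value"})
decoded = parse_qs(encoded)

# An unescaped + in a query string is decoded as a space.
decoded_unescaped = parse_qs("search=Generate+value")

# Passing an already encoded value encodes it a second time.
double_encoded = urlencode({"search": "Generate%2Bvalue"})

print(encoded)            # search=Generate%2Bvalue
print(decoded)            # {'search': ['Generate+value']}
print(decoded_unescaped)  # {'search': ['Generate value']}
print(double_encoded)     # search=Generate%252Bvalue
```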

+3

Some comments on the question and other answers:

  • If you want to keep the order with urllib.urlencode, pass an ordered sequence of key/value pairs instead of a mapping (dict). When you pass in a dict, urlencode just calls foo.items() to get an iterable sequence.

    # urllib.urlencode accepts a mapping or a sequence
    # the output of this can vary, because `items()` is called on the dict
    urllib.urlencode({"data": initial_url,"search": search})
    # the output of this will not vary
    urllib.urlencode((("data", initial_url), ("search", search)))

You can also pass a second argument, doseq, to control how iterable values are handled.
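As a rough illustration of doseq, the sketch below encodes the same list value with and without it. It uses the Python 3 import path (urllib.urlencode takes the same argument in Python 2), and the key name foo is only an example.

```python
from urllib.parse import urlencode  # Python 2: from urllib import urlencode

pairs = [("foo", [1, 2, 3])]

# Without doseq the list is converted with str() and encoded as one value.
as_single_value = urlencode(pairs)

# With doseq=True each element becomes its own key=value pair.
as_repeated_key = urlencode(pairs, doseq=True)

print(as_single_value)  # foo=%5B1%2C+2%2C+3%5D
print(as_repeated_key)  # foo=1&foo=2&foo=3
```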

  1. The order of the parameters does not matter. Take these two URLs for example:

    https://example.com?foo=bar&bar=foo
    https://example.com?bar=foo&foo=bar

    The HTTP server should treat the order of these parameters as irrelevant, but a function designed to compare URLs will not. To compare URLs safely, the parameters need to be sorted first.

    However, consider duplicate keys:

    https://example.com?foo=3&foo=2&foo=1

URI specifications support duplicate keys, but do not address priority or order.

In a given application, each of these could trigger a different result and still be valid:

    https://example.com?foo=1&foo=2&foo=3
    https://example.com?foo=1&foo=3&foo=2
    https://example.com?foo=2&foo=3&foo=1
    https://example.com?foo=2&foo=1&foo=3
    https://example.com?foo=3&foo=1&foo=2
    https://example.com?foo=3&foo=2&foo=1
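One way to compare URLs while ignoring parameter order is to parse the query and sort the pairs. The helper below is a hypothetical sketch (the name same_request is not from any library) using Python 3's urllib.parse. It deliberately treats the order of duplicate keys as irrelevant too, which may or may not suit a given application.

```python
from urllib.parse import urlsplit, parse_qsl


def same_request(url_a, url_b):
    """Return True if both URLs match once their query parameters are sorted."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    same_location = (a.scheme, a.netloc, a.path) == (b.scheme, b.netloc, b.path)
    # parse_qsl yields (key, value) pairs; sorting removes any order dependence.
    same_params = sorted(parse_qsl(a.query)) == sorted(parse_qsl(b.query))
    return same_location and same_params


print(same_request("https://example.com?foo=bar&bar=foo",
                   "https://example.com?bar=foo&foo=bar"))  # True
print(same_request("https://example.com?foo=1&foo=2",
                   "https://example.com?foo=2&foo=1"))      # True
print(same_request("https://example.com?foo=1",
                   "https://example.com?foo=2"))            # False
```

If the order of repeated keys does matter for the application, compare the unsorted parse_qsl output per key instead.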
  • + is a reserved character that represents a space in the urlencoded form (versus %20 in the path part). urllib.urlencode escapes values with urllib.quote_plus(), not urllib.quote(). The OP most likely just wants to do this:

    initial_url = "http://www.stackoverflow.com"
    search = "Generate value"
    urllib.urlencode((("data", initial_url), ("search", search)))

Which produces:

    data=http%3A%2F%2Fwww.stackoverflow.com&search=Generate+value

as the output.
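The difference between quote and quote_plus, and the final URL from the question, can be reproduced with a sketch like this one. It uses Python 3's urllib.parse (the same functions live in urllib on Python 2); the base address is taken from the question, and the variable names are illustrative.

```python
from urllib.parse import quote, quote_plus, urlencode

search = "Generate value"  # raw value with a real space, not a +

path_style = quote(search)        # spaces become %20
form_style = quote_plus(search)   # spaces become +

query_string = urlencode((
    ("data", "http://www.stackoverflow.com"),
    ("search", search),
))
final_url = "www.example.com/find.php?" + query_string

print(path_style)  # Generate%20value
print(form_style)  # Generate+value
print(final_url)
```

The printed final_url matches the target URL given in the question.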

0

Source: https://habr.com/ru/post/916736/

