Django URLValidator giving false errors

Question

I am using Django's URLValidator in a form as follows:

    def clean_url(self):
        validate = URLValidator(verify_exists=True)
        url = self.cleaned_data.get('url')
        try:
            logger.info(url)
            validate(url)
        except ValidationError as e:
            logger.info(e)
            raise forms.ValidationError("That website does not exist. Please try again.")
        return url

It seems to work for some URLs, but it fails for some valid ones. I was able to confirm that it fails for http://www.amazon.com/ (which is clearly wrong). It works for http://www.cisco.com/. Is there a reason for the false errors?

+9

python django

KVISH · Aug 13

1 answer

supervacuo · Accepted Answer · 2012-08-13 18:56

Check out the source for URLValidator; if you specify verify_exists, it makes a HEAD request to the URL to check whether it is valid:

    req = urllib2.Request(url, None, headers)
    req.get_method = lambda: 'HEAD'
    ...
    opener.open(req, timeout=10)

Try making a HEAD request to Amazon yourself, and you will see the problem:

    carl@chaffinch:~$ HEAD http://www.amazon.com
    405 Method Not Allowed
    Date: Mon, 13 Aug 2012 18:50:56 GMT
    Server: Server
    Vary: Accept-Encoding,User-Agent
    Allow: POST, GET
    ...
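If the HEAD command-line tool is not installed, the same probe can be reproduced from Python. The following is only a minimal sketch using Python 3's urllib.request (the successor of the urllib2 module shown above); the function name head_status and its open_url parameter are hypothetical, introduced here so the opener can be swapped out when experimenting.

```python
import urllib.error
import urllib.request


def head_status(url, open_url=urllib.request.urlopen, timeout=10):
    """Send a HEAD request and return the HTTP status code of the reply.

    `open_url` is a hypothetical hook (defaults to urlopen) that makes it
    possible to substitute a stub instead of touching the network.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with open_url(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx/5xx replies; the status code
        # sent by the server is still available on the exception.
        return error.code
```

A server that refuses HEAD the way Amazon did in the output above would yield 405 here, while a server that accepts HEAD would typically yield 200.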

I can't see a way of solving this other than monkey-patching or otherwise extending URLValidator to use GET or POST; before doing so, you should think carefully about whether to use verify_exists at all (without it, this problem should disappear). As core/validators.py itself says,

"The URLField verify_exists argument has intractable security and performance issues. Accordingly, it has been deprecated."

You'll find that the development version of Django has indeed removed this feature completely.
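If you decide that you still want an existence check, the "use GET instead" idea could be sketched roughly as below. This is not Django's implementation: it is a standalone helper written against Python 3's urllib.request, and the names url_exists and open_url are hypothetical. It tries HEAD first and retries with GET only when the server rejects the method itself (405 or 501), which is the case that produced the false error for Amazon.

```python
import urllib.error
import urllib.request


def url_exists(url, open_url=urllib.request.urlopen, timeout=10):
    """Return True if the URL answers HEAD, or GET when HEAD is refused.

    `open_url` is a hypothetical hook (defaults to urlopen) so that a stub
    can replace the real network call.
    """
    for method in ("HEAD", "GET"):
        try:
            request = urllib.request.Request(url, method=method)
            response = open_url(request, timeout=timeout)
        except urllib.error.HTTPError as error:
            # 405/501 mean the server rejected HEAD, not that the page
            # is missing, so fall through and retry with GET.
            if method == "HEAD" and error.code in (405, 501):
                continue
            return False
        except (urllib.error.URLError, ValueError):
            # Unreachable host, timeout, or a malformed URL.
            return False
        response.close()
        return True
    return False
```

In clean_url you could then raise forms.ValidationError when url_exists(url) returns False. Note that this still makes the server fetch a user-supplied URL, so the security and performance concerns quoted above continue to apply.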
