Python Mechanize will not open these sites

I am working with the Python mechanize module. I came across three different sites that mechanize cannot open directly.

Adding the following code lets mechanize open and parse the Wikipedia article and the Google search results:

  br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')] 

But this workaround does not help with the CPSC.gov site: when I try to open it with mechanize's Browser, Python hangs, to the point that I can't even shut it down.

What's going on here?

1 answer

In the case of the cpsc.gov site, the page appears to send a refresh header, which mechanize's HTTPRefreshProcessor handles incorrectly. You can work around the problem by turning refresh handling off:

  import mechanize

  url = 'http://www.cpsc.gov/cpscpub/prerel/prhtml03/03059.html'
  br = mechanize.Browser()
  br.set_handle_refresh(False)
  br.open(url)
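
To make the refresh header a bit more concrete, here is a minimal sketch of what such a header value typically looks like and how a client might decide whether to follow it. It assumes the common "seconds; url=target" form. `parse_refresh` and `should_follow` are hypothetical helpers written for illustration; they are not part of mechanize, and the 1-second threshold is an arbitrary assumption.

```python
def parse_refresh(value):
    """Split a Refresh header value into (delay_seconds, target_url).

    Hypothetical helper, not part of mechanize. Assumes the common
    "seconds; url=target" form; target_url is None when no URL is given.
    """
    delay_part, _, rest = value.partition(';')
    delay = float(delay_part.strip())
    url = None
    # Split only on the first '=', so query strings in the target survive.
    key, sep, target = rest.strip().partition('=')
    if sep and key.strip().lower() == 'url':
        # Some servers wrap the target in quotes; drop them.
        url = target.strip().strip('\'"') or None
    return delay, url


def should_follow(value, max_delay=1.0):
    """Return True only for refreshes whose delay is at most max_delay seconds.

    The max_delay default is an assumption chosen for this example.
    """
    delay, _url = parse_refresh(value)
    return delay <= max_delay


if __name__ == '__main__':
    print(parse_refresh('5; url=http://example.com/next'))
    print(should_follow('1800; url=http://example.com/'))
```

If a client waits out the full delay before following a refresh, a page with a very long delay could look like a hang. That may be what happens here, but I have not verified it for this page. As far as I know, mechanize's `set_handle_refresh` also accepts a `max_time` argument for following only short refreshes, if you would rather not disable refresh handling entirely.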

Source: https://habr.com/ru/post/903876/

