ActivePerl WWW::Mechanize: "Bad Request" error when fetching HTTPS through a proxy

I am trying to fetch an HTTPS URL with WWW::Mechanize through a proxy, but I get an error:

    use strict;
    use warnings;
    use IO::Socket::SSL;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new;
    $mech->proxy(['https','http'], 'http://proxy:8080/');

    $mech->get('https://www.google.com');
    print $mech->content;

Error:

 Error GETing https://www.google.com: Bad Request at perl4.pl line 9. 

When I use LWP::UserAgent instead, I can fetch the same HTTPS URL without an error:

    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;
    $ua->proxy(['https','http'], 'http://proxy:8080/');
    $ua->get('https://www.google.com');

Can anyone help with this?

I am currently using WWW::Mechanize 1.72.

+4
3 answers

I installed LWP-Protocol-connect 6.03 and now connect to the proxy using a connect:// URL:

 $https_proxy = 'connect://proxy:8080/'; 

Now it works fine :D
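For completeness, here is a minimal sketch of how that connect:// URL might be plugged into the WWW::Mechanize script from the question. It assumes LWP::Protocol::connect is installed and reuses the placeholder proxy proxy:8080 from the question. The autocheck => 0 and timeout settings are my additions, so a failure prints a status line instead of dying.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Assumption: LWP::Protocol::connect is installed. As far as I know, LWP
# loads it on demand when it sees a proxy URL with the connect:// scheme.
my $mech = WWW::Mechanize->new( autocheck => 0, timeout => 15 );

# Plain HTTP keeps using the ordinary proxy URL; HTTPS is tunnelled via CONNECT.
$mech->proxy( 'http',  'http://proxy:8080/' );
$mech->proxy( 'https', 'connect://proxy:8080/' );

my $res = $mech->get('https://www.google.com');
print $res->status_line, "\n";
```

WWW::Mechanize is a subclass of LWP::UserAgent, so the proxy() call is the same one you would use on a plain LWP::UserAgent object.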

0

WWW::Mechanize is based on LWP::UserAgent, which for many years has had a strange idea of how to proxy https requests: instead of sending a CONNECT request to create a tunnel and then upgrading to SSL, it sends a GET request with the https URL. See https://rt.cpan.org/Ticket/Display.html?id=1894

A fix has finally been merged into the libwww-perl GitHub repository, but I don't know when the new version of LWP will be released. In the meantime, you can use Net::SSLGlue::LWP, which monkey-patches LWP to provide proper https proxy support (I am the author of Net::SSLGlue::LWP and of the fixes for LWP).
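A rough sketch of what using Net::SSLGlue::LWP could look like with the script from the question. It assumes the module is installed and, following its synopsis, loads it before any LWP-based module. The proxy proxy:8080 is the placeholder from the question, and autocheck => 0 is only there so the result is printed rather than thrown.

```perl
use strict;
use warnings;
use Net::SSLGlue::LWP;    # assumption: load first so LWP is patched before use
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 0, timeout => 15 );

# Same proxy settings as in the question; with the patch in place the
# https request should go through a CONNECT tunnel instead of a plain GET.
$mech->proxy( [ 'https', 'http' ], 'http://proxy:8080/' );

my $res = $mech->get('https://www.google.com');
print $res->status_line, "\n";
```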

+4

Based on the error you posted, I would guess that your proxy blocks specific User-Agent strings. The default User-Agent header sent by LWP::UserAgent is different from the one sent by WWW::Mechanize.

I suggest trying this line:

 my $mech = WWW::Mechanize->new( agent => 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36' ); 

This makes the proxy server and the destination server believe that you are a Chrome browser rather than some crawler, malware, virus, etc.
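To see the difference for yourself, a small sketch like the following prints the default agent string of each class. It makes no network request; the exact strings depend on the installed versions, so none are shown here.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use WWW::Mechanize;

my $ua   = LWP::UserAgent->new;
my $mech = WWW::Mechanize->new;

# agent() with no argument returns the current User-Agent string.
print "LWP::UserAgent: ", $ua->agent,   "\n";
print "WWW::Mechanize: ", $mech->agent, "\n";
```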

Another suggestion is to run Data::Dumper on the $mech object and check what is inside it:

 use Data::Dumper; print Dumper($mech); 

You can also use the same method to dump the contents of $mech after calling get().
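A Data::Dumper dump of the whole object can be large. As an alternative (my suggestion, not something the question used), LWP's handler hooks can print just the request and response as they pass through. This sketch assumes the placeholder proxy from the question and turns autocheck off so the script keeps running after a failed request.

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 0, timeout => 15 );
$mech->proxy( [ 'https', 'http' ], 'http://proxy:8080/' );

# Print every outgoing request and every finished response.
# Returning nothing from the handler lets processing continue as usual.
$mech->add_handler( request_send  => sub { shift->dump; return } );
$mech->add_handler( response_done => sub { shift->dump; return } );

$mech->get('https://www.google.com');
print "Status: ", $mech->status, "\n";
```

If the dumped request line starts with GET followed by the https URL, the request is being sent to the proxy as a plain GET rather than through a tunnel.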

I'm not sure that this is relevant, but note that not all proxies support HTTPS/SSL. Only proxies that allow tunneling through the CONNECT method will let you send HTTPS/SSL traffic through them.
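One way to check whether a proxy accepts CONNECT at all is to send the request by hand and read the status line. The sketch below uses only core modules; the helper names connect_request and probe_proxy are made up for this example, and proxy:8080 is the placeholder from the question.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build the raw CONNECT request for a target host and port (hypothetical helper).
sub connect_request {
    my ( $host, $port ) = @_;
    return "CONNECT ${host}:${port} HTTP/1.1\r\nHost: ${host}:${port}\r\n\r\n";
}

# Send CONNECT to the proxy and return its status line, or a short error text.
sub probe_proxy {
    my ( $proxy_host, $proxy_port, $target_host, $target_port ) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $proxy_host,
        PeerPort => $proxy_port,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or return "cannot reach proxy: $@";

    print {$sock} connect_request( $target_host, $target_port );
    my $status = <$sock>;    # first line of the proxy's reply
    close $sock;

    return 'no reply' unless defined $status;
    $status =~ s/\r?\n\z//;
    return $status;
}

print probe_proxy( 'proxy', 8080, 'www.google.com', 443 ), "\n";
```

A 200 status line suggests the tunnel was established; a 4xx reply such as 400 or 405 suggests the proxy does not allow CONNECT to that host or port.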

0

Source: https://habr.com/ru/post/1501415/

