Why does WWW::Mechanize GET some pages but not others?

I am new to Perl and HTML. I am trying to use $mech->get($url) to fetch http://en.wikipedia.org/wiki/Periodic_table, but it keeps returning the following error:

Error GETing http://en.wikipedia.org/wiki/Periodic_table: Forbidden at PeriodicTable.pl line 13

But $mech->get($url) works fine when $url is http://search.cpan.org/.

Any help would be greatly appreciated!


Here is my code:

#!/usr/bin/perl -w

use strict;
use warnings;
use WWW::Mechanize;
use HTML::TreeBuilder;
my $mech = WWW::Mechanize->new( autocheck => 1 );

$mech = WWW::Mechanize->new();

my $table_url = "http://en.wikipedia.org/wiki/Periodic_table/";

$mech->get( $table_url );
2 answers

Wikipedia rejects requests that arrive with the default WWW::Mechanize User-Agent string, which is why you are getting 403 Forbidden.

To identify yourself as a "normal" browser, set the agent string before calling get(), for example:

$mech->agent( 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-us) AppleWebKit/533.17.8 (KHTML, like Gecko) Version/5.0.1 Safari/533.17.8' );

Also note that the URL in your script ends with a trailing slash, while the URL in your question and in the error message does not. Drop the slash so you request the article page you actually intended.

WWW::Mechanize is a subclass of LWP::UserAgent, so all of LWP::UserAgent's methods, including agent(), are available on your $mech object.

Also have a look at Wikipedia's robots.txt before scraping: Wikipedia deliberately blocks clients that identify themselves with the default LWP::UserAgent (libwww-perl) agent string, and plain LWP::UserAgent does not consult robots.txt for you.
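
Putting it together, here is a minimal sketch of a corrected script (the User-Agent string is just the example from above; any realistic browser string should work, and the trailing slash has been dropped from the article URL):

#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# autocheck => 1 makes get() die with a readable message on HTTP errors
my $mech = WWW::Mechanize->new( autocheck => 1 );

# agent() is inherited from LWP::UserAgent; set it before the first request
$mech->agent('Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_4; en-us) AppleWebKit/533.17.8 (KHTML, like Gecko) Version/5.0.1 Safari/533.17.8');

# no trailing slash on the article URL
my $table_url = 'http://en.wikipedia.org/wiki/Periodic_table';

$mech->get( $table_url );
print $mech->title, "\n";    # quick sanity check that the page arrived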


When you are debugging an HTTP problem like this, it helps to watch the traffic on the wire so you can compare exactly what Mech sends with what Wikipedia sends back. HTTP Scoop is a good sniffer on the Mac.
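
If you do not have a sniffer handy, you can also dump the traffic from inside Perl. This is a small sketch using the handler hooks that WWW::Mechanize inherits from LWP::UserAgent; it prints each request and response as it passes through, so you can compare a failing Wikipedia request with one that works:

use strict;
use warnings;
use WWW::Mechanize;

my $mech = WWW::Mechanize->new( autocheck => 0 );

# dump every outgoing request and every incoming response
$mech->add_handler( request_send  => sub { shift->dump; return } );
$mech->add_handler( response_done => sub { shift->dump; return } );

$mech->get('http://en.wikipedia.org/wiki/Periodic_table');
print 'Status: ', $mech->status, "\n";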

