How can I make WWW::Mechanize not retrieve pages twice?

I have a web scraping application written in OO Perl. It uses a single WWW::Mechanize object throughout the application. How can I make it not fetch the same URL twice, i.e. make a second get() with the same URL a no-op:

my $mech = WWW::Mechanize->new();
my $url = 'http://google.com';

$mech->get( $url ); # first time, fetch
$mech->get( $url ); # same url, do nothing
3 answers

You can subclass WWW::Mechanize and override the get() method to do what you want:

package MyMech;
use base 'WWW::Mechanize';

sub get {
    my $self = shift;
    my($url) = @_;

    # Fetch only if there is no previous response, or the requested URL
    # differs from the one fetched last time.
    if (!defined $self->res || $self->res->request->uri ne $url) {
        return $self->SUPER::get(@_);
    }

    # Same URL as the last request: return the cached response instead.
    return $self->res;
}
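
A quick usage sketch, assuming the package above is saved as MyMech.pm (with a trailing `1;` so it loads cleanly). Note that this approach only remembers the most recently fetched URL, so alternating between two URLs still re-fetches each time:

    use strict;
    use warnings;
    use MyMech;    # the subclass defined above, assumed saved as MyMech.pm

    my $mech = MyMech->new();
    my $url  = 'http://google.com';

    $mech->get( $url );    # first call: actually fetches the page
    $mech->get( $url );    # same URL as last time: returns the cached response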

See WWW::Mechanize::Cached:

Synopsis

use WWW::Mechanize::Cached;

my $cacher = WWW::Mechanize::Cached->new;
$cacher->get( $url );

Description

It uses the Cache::Cache hierarchy to cache fetched pages, so repeated requests for the same URL are served from the cache instead of hitting the server again.
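
If you want the cache to persist between runs, the constructor accepts a cache object. A minimal sketch assuming a Cache::FileCache backend; the namespace and expiry values are purely illustrative, and newer versions of the module use CHI-style caches, so check the documentation for your installed version:

    use WWW::Mechanize::Cached;
    use Cache::FileCache;

    # Illustrative on-disk cache; adjust namespace and expiry to taste.
    my $cache = Cache::FileCache->new( {
        namespace          => 'mech-cache',
        default_expires_in => 600,            # seconds
    } );

    my $mech = WWW::Mechanize::Cached->new( cache => $cache );
    $mech->get( 'http://google.com' );    # fetched and cached
    $mech->get( 'http://google.com' );    # served from the cache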


You can store URLs and their responses in a hash.

my $mech = WWW::Mechanize->new();
my $url = 'http://google.com';
my %response;

# Fetch only if we have not already requested this URL.
$response{$url} = $mech->get($url) unless $response{$url};
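
A minimal sketch of wrapping that idea in a helper so callers always get a response back; the cached_get name and the %seen hash are illustrative, not part of WWW::Mechanize:

    my %seen;    # url => HTTP::Response from the first fetch

    sub cached_get {
        my ( $mech, $url ) = @_;

        # Fetch only the first time this URL is requested; afterwards,
        # hand back the stored response.
        $seen{$url} = $mech->get($url) unless exists $seen{$url};
        return $seen{$url};
    }

    cached_get( $mech, $url );    # fetches the page
    cached_get( $mech, $url );    # no network request, returns stored response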