How can I make 25 requests at a time with HTTP::Async in Perl?

I am making a lot of HTTP requests and chose HTTP::Async to do the job. I have more than 1000 requests to make, and if I just do the following (see the code below), many requests time out before they are processed, because it can take tens of minutes before processing gets to them:

    for my $url (@urls) {
        $async->add(HTTP::Request->new(GET => $url));
    }
    while (my $resp = $async->wait_for_next_response) {
        # use $resp
    }

So I decided to make 25 requests at a time, but I can't think of a way to express this in code.

I tried the following:

    while (1) {
        L25:
        for (1..25) {
            my $url = shift @urls;
            if (!defined($url)) {
                last L25;
            }
            $async->add(HTTP::Request->new(GET => $url));
        }
        while (my $resp = $async->wait_for_next_response) {
            # use $resp
        }
    }

This, however, is not good either, because now it is too slow. It waits until all 25 requests have been processed before it adds another 25. So even when only 2 requests are still outstanding, nothing new is started; I have to wait for the whole batch to finish before the next batch of 25 is added.

How could I improve this logic so that $async keeps working while I process the responses, while also making sure the requests do not time out?

+6
2 answers

You are close, you just need to combine the two approaches! :-)

Untested, so treat it as pseudocode. In particular, I'm not sure that total_count is the right method to use; the documentation does not say. Alternatively, you could keep a counter $active_requests that you ++ when adding a request and -- when you get a response.

    while (1) {

        # if there aren't already 25 requests "active", then add more
        while (@urls and $async->total_count < 25) {
            my $url = shift @urls;
            $async->add( ... );
        }

        # deal with any finished requests right away, we wait for a
        # second just so we don't spin in the main loop too fast.
        while (my $response = $async->wait_for_next_response(1)) {
            # use $response
        }

        # finish the main loop when there is no more work
        last unless ($async->total_count or @urls);
    }
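
The counter alternative mentioned above could look roughly like this. It is an untested sketch: run_throttled is a hypothetical helper, not part of HTTP::Async, and it assumes that every request you add eventually yields exactly one response (error responses included) and that wait_for_next_response returns undef when its timeout elapses without a response.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTTP::Async): keeps at most $max requests
# active by counting them ourselves instead of relying on total_count.
# $async only needs add() and wait_for_next_response(), as HTTP::Async has.
sub run_throttled {
    my ($async, $requests, $max, $on_response) = @_;
    my $active_requests = 0;

    while (@$requests or $active_requests) {

        # top up until $max requests are active or nothing is left to add
        while (@$requests and $active_requests < $max) {
            $async->add(shift @$requests);
            $active_requests++;
        }

        # wait up to one second for a response, then go back and top up
        my $response = $async->wait_for_next_response(1);
        if (defined $response) {
            $active_requests--;
            $on_response->($response);
        }
    }
    return;
}

# Example call (assumes HTTP::Async and HTTP::Request are loaded and @urls is set):
#   my @requests = map { HTTP::Request->new(GET => $_) } @urls;
#   run_throttled(HTTP::Async->new, \@requests, 25, sub { my ($resp) = @_; ... });
```

Unlike the loop above, this goes back to add a new request after every single response, so a slot is refilled as soon as it frees up.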
+2

If you cannot call wait_for_next_response fast enough because you are in the middle of executing other code, the simplest solution is to make that code interruptible by moving it to a separate thread of execution. But if you are going to start using threads, why use HTTP::Async at all?

    use threads;
    use Thread::Queue::Any 1.03;
    use LWP::UserAgent;
    use HTTP::Request;

    use constant NUM_WORKERS => 25;

    my $req_q = Thread::Queue::Any->new();
    my $res_q = Thread::Queue::Any->new();

    # Each worker pulls requests off the queue until it dequeues undef.
    my @workers;
    for (1..NUM_WORKERS) {
        push @workers, async {
            my $ua = LWP::UserAgent->new();
            while (my $req = $req_q->dequeue()) {
                $res_q->enqueue( $ua->request($req) );
            }
        };
    }

    for my $url (@urls) {
        $req_q->enqueue( HTTP::Request->new( GET => $url ) );
    }

    # One undef per worker tells each of them to stop.
    $req_q->enqueue(undef) for @workers;

    for ( 1..@urls ) {
        my $res = $res_q->dequeue();
        ...
    }

    $_->join() for @workers;
+2

Source: https://habr.com/ru/post/918838/

