Perl - HTTP XHR/JSON Proxy Capture

The site http://openbook.etoro.com/#/main/ has a live channel that is driven by JavaScript through keep-alive XHR requests, receiving gzip-compressed JSON responses from the server.

I want to capture the channel to a file.

The usual way (WWW::Mechanize and the like) is probably unviable, because reverse-engineering all the JavaScript on the page and simulating the browser is a really difficult task, so I am looking for an alternative solution.

My idea uses a man-in-the-middle tactic: the browser does the hard work, and I grab the traffic with a Perl proxy dedicated to this task only.

I can catch the initial message, but not the channel itself. The proxy itself works, because the channel still plays in the browser while running through it.

    use strict;
    use warnings;
    use HTTP::Proxy;
    use HTTP::Proxy::HeaderFilter::simple;
    use HTTP::Proxy::BodyFilter::simple;
    use Data::Dumper;

    my $proxy = HTTP::Proxy->new(
        port                    => 3128,
        max_clients             => 100,
        max_keep_alive_requests => 100,
    );

    my $hfilter = HTTP::Proxy::HeaderFilter::simple->new(
        sub {
            my ( $self, $headers, $message ) = @_;
            print STDERR "headers", Dumper($headers);
        }
    );

    my $bfilter = HTTP::Proxy::BodyFilter::simple->new(
        filter => sub {
            my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
            print STDERR "dataref", Dumper($dataref);
        }
    );

    $proxy->push_filter( response => $hfilter );    # header dumper
    $proxy->push_filter( response => $bfilter );    # body dumper
    $proxy->start;

Firefox is configured to use the above proxy for all protocols.

The channel works in the browser, so the proxy is feeding it data. (When I stop the proxy, the feed stops too.) Intermittently (I cannot tell exactly when) I get the following error:

 [Tue Jul 10 17:13:58 2012] (42289) ERROR: Getting request failed: Client closed 

Can someone show me how to construct the correct HTTP::Proxy filters to Dumper all keep-alive XHR communication between the browser and the server?

+6
2 answers

Here is something that I think is doing what you need:

    #!/usr/bin/perl
    use 5.010;
    use strict;
    use warnings;

    use HTTP::Proxy;
    use HTTP::Proxy::BodyFilter::complete;
    use HTTP::Proxy::BodyFilter::simple;
    use JSON::XS qw( decode_json );
    use Data::Dumper qw( Dumper );

    my $proxy = HTTP::Proxy->new(
        port                    => 3128,
        max_clients             => 100,
        max_keep_alive_requests => 100,
    );

    my $filter = HTTP::Proxy::BodyFilter::simple->new(
        sub {
            my ( $self, $dataref, $message, $protocol, $buffer ) = @_;
            return unless $$dataref;

            my $content_type = $message->headers->content_type or return;
            say "\nContent-type: $content_type";

            my $data = decode_json($$dataref);
            say Dumper($data);
        }
    );

    $proxy->push_filter(
        method   => 'GET',
        mime     => 'application/json',
        response => HTTP::Proxy::BodyFilter::complete->new,
        response => $filter,
    );

    $proxy->start;

I don’t think you need a separate header filter, because you can access any headers you want to see using $message->headers in the body filter.

You will notice that I pushed two filters onto the pipeline. The first one is of type HTTP::Proxy::BodyFilter::complete, and its job is to collect the chunks of the response and make sure that the filter that follows it always receives the complete message in $dataref. However, for each chunk that is received and buffered, the next filter is still called and passed an empty $dataref. My filter ignores those by returning early.
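That early-return guard can be factored out and tested on its own. A minimal sketch (the `has_payload` helper name is my own, not part of HTTP::Proxy):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Returns true only when the filter received a non-empty chunk.
# HTTP::Proxy::BodyFilter::complete buffers partial chunks and passes
# an empty $dataref downstream, so later filters should skip those.
sub has_payload {
    my ($dataref) = @_;
    return defined $$dataref && length $$dataref;
}

my $empty = '';
my $full  = '{"ok":1}';
print has_payload(\$empty) ? "yes" : "no", "\n";  # no
print has_payload(\$full)  ? "yes" : "no", "\n";  # yes
```

Inside the body filter this becomes `return unless has_payload($dataref);`, which is equivalent to the `return unless $$dataref;` line in the answer above.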

I also set up the filter pipeline to ignore everything except GET requests that produce JSON responses, as those seem to be the most interesting.

Thank you for asking this question - it was an interesting little problem, and you seem to have done most of the hard work.

+5

Set the mime parameter; by default only text/* types are filtered.

    $proxy->push_filter( response => $hfilter, mime => 'application/json' );
    $proxy->push_filter( response => $bfilter, mime => 'application/json' );
+2

Source: https://habr.com/ru/post/920138/

