Is it possible to parse JSON with Goutte?

I am crawling websites and so far have had no problem parsing HTML with Goutte. But I also need to fetch JSON from the website, and because of cookie management I can't do this with file_get_contents(); it does not work.

I could do it with plain cURL, but in this case I just want to use Goutte and no other library.

So, is there a way to fetch and parse plain text through Goutte, or do I really need to fall back on the good old methods?

    /* Sample Code */
    $client = new Client();
    $crawler = $client->request('GET', 'foo');
    $crawler = $crawler->filter('bar'); // of course not working

Thanks.

+4
4 answers

After digging deep into the Goutte libraries, I found a way and wanted to share it, because Goutte is a really powerful library but its documentation is a lot to wade through.

Parsing JSON via (Goutte > Guzzle)

Just request the desired page and decode the JSON into an array.

    $client = new Client(); // Goutte Client

    // getClient() returns the underlying Guzzle client
    $request = $client->getClient()->createRequest('GET', 'http://***.json');
    $response = $request->send(); // Send the created request to the server
    $data = $response->json();    // Returns a PHP array

JSON parsing with cookies via (Goutte + Guzzle) - for authentication

First send a request to one of the site's pages (the main page works best) to receive cookies, then reuse those cookies to authenticate the JSON request.

    $client = new Client(); // Goutte Client

    // Send a request directly and get the whole page. The response includes
    // cookies from the server, which are stored automatically in the Goutte Client object.
    $crawler = $client->request("GET", "http://foo.bar");

    // getClient() returns the underlying Guzzle client
    $request = $client->getClient()->createRequest('GET', 'http://foo.bar/baz.json');

    // Get the cookies from the Goutte Client and add them to the Guzzle request
    $cookies = $client->getRequest()->getCookies();
    foreach ($cookies as $key => $value) {
        $request->addCookie($key, $value);
    }

    $response = $request->send(); // Send the created request to the server
    $data = $response->json();    // Returns a PHP array

Hope this helps; I spent almost 3 days figuring out Goutte and its components.

+14

I figured this out after several hours of searching. Just do the following:

    $client = new Client(); // Goutte Client
    $crawler = $client->request("GET", "http://foo.bar");
    $jsonData = $crawler->text();
+2

The solution by mithataydogmus did not work for me, so I created a new class, "BetterClient":

    use Goutte\Client as GoutteClient;

    class BetterClient extends GoutteClient
    {
        private $guzzleResponse;

        public function getGuzzleResponse()
        {
            return $this->guzzleResponse;
        }

        protected function createResponse($response)
        {
            // Keep a reference to the raw Guzzle response before Goutte converts it
            $this->guzzleResponse = $response;

            return parent::createResponse($response);
        }
    }

Usage:

    $client = new BetterClient();
    $request = $client->request('GET', $url);
    $data = $client->getGuzzleResponse()->json();
+1

I could also get JSON with:

 $client->getResponse()->getContent()->getContents() 
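
Whichever call you use to get the body, what comes back is a plain string that still has to be decoded. The `json()` helper used in the answers above depends on the Guzzle version bundled with Goutte, so it may not be available in your setup. Here is a minimal sketch of decoding the string yourself with PHP's built-in json_decode(). The helper name `decodeJsonBody` and the sample `$body` string are made up for illustration and are not part of Goutte or Guzzle; JSON_THROW_ON_ERROR assumes PHP 7.3 or newer.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Goutte or Guzzle): decode a raw JSON
 * response body into a PHP array, throwing on malformed input.
 */
function decodeJsonBody(string $body): array
{
    // Passing true as the second argument returns associative arrays
    // instead of stdClass objects; JSON_THROW_ON_ERROR raises JsonException
    // on invalid JSON instead of silently returning null.
    $data = json_decode($body, true, 512, JSON_THROW_ON_ERROR);

    // Valid JSON can also be a bare scalar (e.g. 42); reject that here.
    if (!is_array($data)) {
        throw new UnexpectedValueException('Expected a JSON object or array.');
    }

    return $data;
}

// Stand-in for the string obtained from the client's response body.
$body = '{"id": 1, "tags": ["a", "b"]}';
$data = decodeJsonBody($body);

echo $data['tags'][1], PHP_EOL; // prints "b"
```

In real use you would pass the response body string in place of `$body`.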
+1

Source: https://habr.com/ru/post/1501502/