XHR request URL does not exist when trying to parse its contents

Before building a complete solution with Scrapy, I am posting a simplified version of what I want to do:

    import requests

    url = 'http://www.whoscored.com/stageplayerstatfeed/?field=1&isAscending=false&orderBy=Rating&playerId=-1&stageId=9155&teamId=32"'
    params = {'d': date.strftime('%Y%m'), 'isAggregate': 'false'}
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36'}

    response = requests.get(url, params=params, headers=headers)

    fixtures = response.body
    #fixtures = literal_eval(response.content)
    print fixtures

Running this code reports that the URL above does not exist. The URL is that of the XHR request sent when switching from the General tab to the Home tab of the main table on this page:

 http://www.whoscored.com/Teams/32/ 

If you enable XHR logging in the Chrome Developer Tools console, you can see both the XHR request and the server's response, which is a dictionary (the format I expect).

Can someone tell me why the above code does not return the data that I expect to see?

thanks

1 answer

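Your code has a few problems:

  • The URL must be just http://www.whoscored.com/stageplayerstatfeed, without the query string or the stray trailing quote (").
  • The GET parameters are wrong: field, isAscending, orderBy, playerId, stageId and teamId belong in params, and d and isAggregate are dropped.
  • Important headers are missing: X-Requested-With, Host and Referer.
  • You need response.json(), not response.body; a requests response has no body attribute.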
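To make the first two points concrete, here is a minimal sketch that takes the URL exactly as written in the question and splits it into an endpoint and a parameter dictionary using only the standard library. The variable names (original_url, base_url, embedded) are mine and purely illustrative, and the sketch is written for Python 3, unlike the Python 2 snippets in this thread. It makes no network request; it only shows that the trailing quote ends up inside the teamId value.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# URL exactly as written in the question, including the stray trailing quote.
original_url = (
    'http://www.whoscored.com/stageplayerstatfeed/'
    '?field=1&isAscending=false&orderBy=Rating'
    '&playerId=-1&stageId=9155&teamId=32"'
)

parts = urlsplit(original_url)

# Endpoint without the query string or the trailing slash.
base_url = '{}://{}{}'.format(parts.scheme, parts.netloc, parts.path.rstrip('/'))

# Query string as parsed from the URL: the quote is part of the teamId value.
embedded = dict(parse_qsl(parts.query))

# Cleaned-up parameters, suitable for passing as params=... to requests.get.
params = {key: value.rstrip('"') for key, value in embedded.items()}

print(base_url)
print(repr(embedded['teamId']))
print(urlencode(params))
```

Passing the parameters as a dictionary also lets requests build and encode the query string itself, which avoids this kind of typo.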

Fixed Version:

    import requests

    url = 'http://www.whoscored.com/stageplayerstatfeed'
    params = {
        'field': '1',
        'isAscending': 'false',
        'orderBy': 'Rating',
        'playerId': '-1',
        'stageId': '9155',
        'teamId': '32'
    }
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36',
               'X-Requested-With': 'XMLHttpRequest',
               'Host': 'www.whoscored.com',
               'Referer': 'http://www.whoscored.com/Teams/32/'}

    response = requests.get(url, params=params, headers=headers)

    fixtures = response.json()
    print fixtures

Output:

    [
        {
            u'AccurateCrosses': 0,
            u'AccurateLongBalls': 10,
            u'AccuratePasses': 89,
            u'AccurateThroughBalls': 0,
            u'AerialLost': 2,
            u'AerialWon': 4,
            ...
        },
        ...
    ]
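
Since the decoded response is a list of dictionaries, a small sketch of how it might be handled defensively may help. The helpers decode_feed and column are hypothetical names of my own, not part of requests, and the sample list only repeats the few keys visible in the truncated output above. The sketch assumes the response object exposes status_code and json(), as requests responses do, and is written for Python 3.

```python
def decode_feed(response):
    """Return the parsed JSON body, or raise an error with a readable message."""
    if response.status_code != 200:
        raise RuntimeError('request failed with HTTP status %d' % response.status_code)
    try:
        return response.json()
    except ValueError:
        # The JSON decoding errors raised by requests are ValueError subclasses.
        raise RuntimeError('response body is not valid JSON')


def column(rows, key, default=None):
    """Collect one statistic from every dictionary in the decoded list."""
    return [row.get(key, default) for row in rows]


# Illustrative data: only the keys shown in the truncated output above.
sample = [
    {
        'AccurateCrosses': 0,
        'AccurateLongBalls': 10,
        'AccuratePasses': 89,
        'AccurateThroughBalls': 0,
        'AerialLost': 2,
        'AerialWon': 4,
    },
]

print(column(sample, 'AccuratePasses'))
```

With the fixed version, this would be used as fixtures = decode_feed(requests.get(url, params=params, headers=headers)), so a wrong URL surfaces as a status error rather than a confusing failure while parsing.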

Source: https://habr.com/ru/post/1201778/

