Python: requests.get with a URL that changes on each loop iteration

I am trying to get information from stats.nba.com by calling requests.get(url) in a for loop, where the url changes on each iteration. A single request works fine, but two or more seem to give errors, and I'm not sure why. I am new to programming, so any information would be helpful. Thank you in advance. Here is my code:

    import requests
    import json

    team_id = 1610612737

    def get_data(url):
        response = requests.get(url)
        if response.status_code == 200:
            data = response.json()
            return data
        else:
            print(response.text)
            print(response.status_code)

    for i in range(30):  # 30 NBA Teams
        base_url = "http://stats.nba.com/stats/teamdetails?teamID="
        team_url = base_url + str(team_id)
        data = get_data(team_url)
        ## Do stuff ##
        team_id += 1

If I do "for i in range(1):" it works, but I get status_code = 400 on each iteration if the range is greater than 1. Thanks for the help!

1 answer

The website limits the number of requests per second, so you need to either include certain request headers or put a delay in your script (the first option is faster and probably the more reliable of the two).

Header Method:

    # add under team_id = 1610612737
    HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
                              'AppleWebKit/537.36 (KHTML, like Gecko) '
                              'Chrome/45.0.2454.101 Safari/537.36'),
               'referer': 'http://stats.nba.com/scores/'}

Then pass the headers in your requests.get call:

 response = requests.get(url, headers=HEADERS) 

* You do not need to delay your script at all if you use this method.
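For reference, here is a minimal sketch of the question's loop with the headers wired in; the endpoint, the teamID parameter, the starting team_id, and the 30-team range are taken from the question as-is:

    import requests

    HEADERS = {'user-agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) '
                              'AppleWebKit/537.36 (KHTML, like Gecko) '
                              'Chrome/45.0.2454.101 Safari/537.36'),
               'referer': 'http://stats.nba.com/scores/'}

    def get_data(url):
        response = requests.get(url, headers=HEADERS)
        if response.status_code == 200:
            return response.json()
        # print diagnostics on failure instead of raising
        print(response.status_code)
        print(response.text)

    team_id = 1610612737  # first team ID from the question
    base_url = "http://stats.nba.com/stats/teamdetails?teamID="

    for i in range(30):  # 30 NBA teams
        data = get_data(base_url + str(team_id))
        # ... do stuff with data ...
        team_id += 1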

Delay Method:

    import time
    time.sleep(10)  # delays for 10 seconds (put inside your loop)
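If you do go this route, the sleep call belongs inside the loop, between requests. A short sketch, reusing the get_data helper and base_url from the question (the 10-second interval is just the value shown above):

    import time

    for i in range(30):
        data = get_data(base_url + str(team_id))
        team_id += 1
        time.sleep(10)  # wait 10 seconds before the next request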

Results with a delay seem to be hit or miss, so I would not recommend it unless absolutely necessary.


Source: https://habr.com/ru/post/1247906/
