I want to parse some information from a website that has data spread across multiple pages.
The problem is that I do not know how many pages there are: there may be 2, or 4, or even just one.
How can I iterate over pages when I don’t know how many pages there will be?
I do, however, know the URL pattern, which looks something like the one in the code below.
In addition, the page names are not plain numbers: they are 'pe2' for page 2, 'pe4' for page 3, etc., so I cannot just loop over range(number).
This is dummy code for the loop I'm trying to fix:
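import requests
from bs4 import BeautifulSoup

# The first page has no suffix; later pages use 'pe2', 'pe4', ...
pages = ['', 'pe2', 'pe4', 'pe6', 'pe8']

for i in pages:
    url = "http://www.website.com/somecode/dummy?page={}".format(i)
    r = requests.get(url)
    # Name the parser explicitly so bs4 does not have to guess one
    soup = BeautifulSoup(r.content, "html.parser")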
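
One direction this could take, as a rough sketch only: generate the suffixes lazily instead of hard-coding the list, and stop at the first page that yields nothing. The names `page_suffixes`, `scrape_all`, `fetch`, and `max_pages` are hypothetical. The sketch also assumes the site gives some detectable signal once the last page is passed, which may not be true.

```python
from itertools import count, islice

# URL pattern taken from the dummy code above
BASE_URL = "http://www.website.com/somecode/dummy?page={}"


def page_suffixes():
    """Yield '', 'pe2', 'pe4', 'pe6', ... with no fixed upper bound."""
    yield ""                   # the first page has no suffix
    for n in count(2, 2):      # 2, 4, 6, ...
        yield "pe{}".format(n)


def scrape_all(fetch, max_pages=100):
    """Call fetch(url) page by page until it returns None.

    fetch is a hypothetical callable supplied by the caller: it returns
    the parsed data for one page, or None when the page is missing or empty.
    max_pages is a safety cap (an assumption) so the loop cannot run forever.
    """
    results = []
    for suffix in islice(page_suffixes(), max_pages):
        data = fetch(BASE_URL.format(suffix))
        if data is None:       # assumed "no more pages" signal
            break
        results.append(data)
    return results
```

Here `fetch` would wrap the `requests.get` and `BeautifulSoup` calls and return None when, for example, the status code is not 200 or the expected element is absent from the page. Which check is right depends on how the site responds beyond the last page.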