This suggestion is a bit more involved, but if you control GetSubUrls, a more Pythonic approach is to make it a generator that yields each URL as it finds it. You can then handle each URL outside the function in a for loop. I assume GetSubUrls currently looks something like this:
    def GetSubUrls(self, url):
        urls = []
        document = openUrl(url)
        for stuff in document:
            urls.append(stuff)
        return urls
That is, it builds a list of URLs and returns the entire list. You can make it a generator:
    def GetSubUrls(self, url):
        document = openUrl(url)
        for stuff in document:
            yield stuff
Then you can just do
    for url in self.Scraper.GetSubUrls(url):
        self.UI.addlink(url[0], url[1])
This works the same as before, but if GetSubUrls is a generator, it does not wait to collect all the sub-URLs before returning them. It yields them one at a time, and your code processes them one at a time.
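As a quick illustration (a toy sketch with made-up URLs, not the asker's actual scraper), you can see this interleaving directly:

    def get_sub_urls(url):
        # Pretend each of these was scraped from the page at `url`.
        for sub in ("/a", "/b", "/c"):
            print("scraped", sub)   # runs only when the caller asks for the next item
            yield sub

    for sub_url in get_sub_urls("http://example.com"):
        print("processed", sub_url)

The output interleaves the "scraped" and "processed" lines: each URL is handled as soon as it is produced, not after the whole list has been built.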
One advantage of this approach is that you can store the generator and consume it whenever you want, instead of having the processing happen via callbacks invoked from inside GetSubUrls. That is, you can write urls = GetSubUrls(url), save the generator for later, and still retrieve the URLs on demand, one at a time, when you need them (see the sketch below). Using a callback forces GetSubUrls to process all the URLs in one go. Another advantage is that you do not need to create a bunch of tiny callbacks with only a line or two of content; instead, you can write those one-liners naturally as the body of a for loop.
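Here is a sketch of that "save it for later" usage. It reuses the names from the question's code (self.Scraper and self.UI.addlink are the asker's objects, and each yielded item is assumed to be a (url, title)-style tuple, as in the loop above):

    urls = self.Scraper.GetSubUrls(url)  # no scraping has happened yet
    first = next(urls)                   # does only enough work to produce one item
    self.UI.addlink(first[0], first[1])
    # ...do other things, then resume exactly where the generator left off:
    for sub_url in urls:
        self.UI.addlink(sub_url[0], sub_url[1])

Note that next(urls) raises StopIteration if the page yields no URLs at all, so in real code you would guard that call or pass next() a default.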
Read up on Python generators for more on this (for example, the Stack Overflow question "What does the yield keyword do in Python?").