I use XMLFeedSpider in Scrapy to scrape a real estate site.
Each URL requested by my spider (via start_urls) returns an XML page with a batch of ads and a link to the next page (search results are limited to 50 ads).
How can I add this next page as a new request in my spider?
I searched Stack Overflow for a while, but I just can't find a simple answer to my problem!
Below is the code that I have in my spider. I updated it to use the parse_nodes() method mentioned by Paul, but the next URL is not followed for some reason.
Can I yield additional requests in the adapt_response method?
from scrapy.spider import log
from scrapy.selector import XmlXPathSelector
from scrapy.contrib.spiders import XMLFeedSpider
from crawler.items import RefItem, PicItem
from crawler.seloger_helper import urlbuilder
from scrapy.http import Request

class Seloger_spider_XML(XMLFeedSpider):
    name = 'Seloger_spider_XML'
    allowed_domains = ['seloger.com']
    iterator = 'iternodes'
Thanks. Gilles