Since nothing works, I started a new project with
python scrapy-ctl.py startproject Nu
I definitely followed the tutorial and created folders, and a new spider
from scrapy.contrib.spiders import CrawlSpider, Rule from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor from scrapy.selector import HtmlXPathSelector from scrapy.item import Item from Nu.items import NuItem from urls import u class NuSpider(CrawlSpider): domain_name = "wcase" start_urls = ['http://www.whitecase.com/aabbas/'] names = hxs.select('//td[@class="altRow"][1]/a/@href').re('/.a\w+') u = names.pop() rules = (Rule(SgmlLinkExtractor(allow=(u, )), callback='parse_item'),) def parse(self, response): self.log('Hi, this is an item page! %s' % response.url) hxs = HtmlXPathSelector(response) item = Item() item['school'] = hxs.select('//td[@class="mainColumnTDa"]').re('(?<=(JD,\s))(.*?)(\d+)') return item SPIDER = NuSpider()
and when i started
C:\Python26\Scripts\Nu>python scrapy-ctl.py crawl wcase
I get
[Nu] ERROR: Could not find spider for domain: wcase
Other spiders are at least recognized by Scrapy, this is not. What am I doing wrong?
Thanks for your help!
source share