Spider must return Request, BaseItem, dict or None, got 'set'

I am trying to download the images of all products from a shopclues.com listing page. My spider looks like this:

from shopclues.items import ImgData
import scrapy

class multipleImages(scrapy.Spider):
    name = 'multipleImages'
    start_urls = ['http://www.shopclues.com/electronic-accessories-8/cameras-18/cameras-special.html?search=1&q1=camera',]

    def parse(self, response):
        for url in response.css('div.products-grid div.grid-product'):
            yield {
                ImgData(image_urls=[url.css('img::attr(src)').extract()])
            }

And items.py:

import scrapy
from scrapy.item import Item
class ShopcluesItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    pass

class ImgData(Item):
    image_urls=scrapy.Field()
    images=scrapy.Field()

But I get the following error when I start the spider:

2016-09-29 11:56:19 [scrapy] DEBUG: Crawled (200) <GET http://www.shopclues.com/robots.txt> (referer: None)
2016-09-29 11:56:20 [scrapy] DEBUG: Crawled (200) <GET http://www.shopclues.com/electronic-accessories-8/cameras-18/cameras-special.html?search=1&q1=camera> (referer: None)
2016-09-29 11:56:20 [scrapy] ERROR: Spider must return Request, BaseItem, dict or None, got 'set' in <GET http://www.shopclues.com/electronic-accessories-8/cameras-18/cameras-special.html?search=1&q1=camera>
2016-09-29 11:56:20 [scrapy] ERROR: Spider must return Request, BaseItem, dict or None, got 'set' in <GET http://www.shopclues.com/electronic-accessories-8/cameras-18/cameras-special.html?search=1&q1=camera>
2016-09-29 11:56:20 [scrapy] ERROR: Spider must return Request, BaseItem, dict or None, got 'set' in <GET http://www.shopclues.com/electronic-accessories-8/cameras-18/cameras-special.html?search=1&q1=camera>
2016-09-29 11:56:20 [scrapy] ERROR: Spider must return Request, BaseItem, dict or None, got 'set' in <GET http://www.shopclues.com/electronic-accessories-8/cameras-18/cameras-special.html?search=1&q1=camera>
2016-09-29 11:56:20 [scrapy] ERROR: Spider must return Request, BaseItem, dict or None, got 'set' in <GET http://www.shopclues.com/electronic-accessories-8/cameras-18/cameras-special.html?search=1&q1=camera>

What does this error mean? What are the possible causes of the error?

2 answers

Collect all the image URLs into a single list and yield one item carrying that list, so the list is passed on to the pipeline:

def parse(self, response):
    images = ImgData()
    images['image_urls'] = []
    for url in response.css('div.products-grid div.grid-product'):
        # extract_first() returns a single string (or None), so the list stays flat
        images['image_urls'].append(url.css('img::attr(src)').extract_first())
    yield images
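
For the URLs in image_urls to actually be downloaded, the images pipeline also has to be enabled in the project settings. A minimal settings.py sketch, assuming a Scrapy version where the pipeline lives at scrapy.pipelines.images.ImagesPipeline and that Pillow is installed; the storage path is only a placeholder:

```python
# settings.py (sketch): enable Scrapy's built-in images pipeline
ITEM_PIPELINES = {
    'scrapy.pipelines.images.ImagesPipeline': 1,
}

# Directory where downloaded images are stored (placeholder path, adjust as needed)
IMAGES_STORE = '/path/to/image/dir'
```

By default the pipeline reads URLs from the item's image_urls field and writes download results to its images field, which matches the ImgData item from the question.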

In Python, curly braces are used for both sets and dicts. {a, b, c, d} <- that is a set, {a: b, c: d} <- that is a dict.
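
The difference is easy to check in a plain interpreter. A small sketch, where the tuple is just a hypothetical hashable stand-in for an item and the example.com URL is made up:

```python
# Hypothetical stand-in for a scraped item (any hashable object works here)
item = ('image_urls', 'http://example.com/a.jpg')

as_set = {item}             # bare element inside braces: a set
as_dict = {'images': item}  # key: value pair inside braces: a dict

print(type(as_set).__name__)   # set
print(type(as_dict).__name__)  # dict
```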

So this yields a set:

yield {
    ImgData(image_urls=[url.css('img::attr(src)').extract()])
}

Did you perhaps mean this?

yield {
    'images': ImgData(image_urls=[url.css('img::attr(src)').extract()]),
}
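
One thing to keep in mind with the snippet above: extract() already returns a list, so wrapping it in [...] gives a nested list, and under the 'images' key the URLs no longer sit in a top-level image_urls field, which is where Scrapy's images pipeline looks by default. A minimal sketch of yielding one dict per product with a flat list instead; items_from_srcs is a hypothetical helper and the example.com URLs are made up, standing in for what url.css('img::attr(src)').extract() would return for each product:

```python
def items_from_srcs(srcs_per_product):
    """Yield one dict per product (hypothetical helper, no Scrapy needed to run it)."""
    for srcs in srcs_per_product:
        # key: value inside the braces makes this a dict, not a set;
        # srcs is already a list, so it is not wrapped in another list
        yield {'image_urls': srcs}


# Made-up data in place of the per-product extract() results
fake_results = [['http://example.com/a.jpg'], ['http://example.com/b.jpg']]
items = list(items_from_srcs(fake_results))
print(items[0])
```

Inside a real parse() the same shape would be yield {'image_urls': url.css('img::attr(src)').extract()}, or equivalently yield ImgData(image_urls=...) without the surrounding braces.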

Source: https://habr.com/ru/post/1656173/

