Scrapy login authentication with cookies

Question


I am new to scrapy and decided to try it because of the good online reviews. I am trying to log in to a website with scrapy. I have already logged in successfully with a combination of selenium and mechanize, by collecting the needed cookies with selenium and adding them to mechanize. Now I am trying to do something similar with scrapy and selenium, but I can't seem to get anything to work. I can’t even tell if something is working or not. Can anyone please help me? Below is what I have started on. I may not even need to transfer the cookies to scrapy, but I can't tell whether the thing ever actually logs in or not. Thanks

    from scrapy.spider import BaseSpider
    from scrapy.http import Response,FormRequest,Request
    from scrapy.selector import HtmlXPathSelector
    from selenium import webdriver

    class MySpider(BaseSpider):
        name = 'MySpider'
        start_urls = ['http://my_domain.com/']

        def get_cookies(self):
            driver = webdriver.Firefox()
            driver.implicitly_wait(30)
            base_url = "http://www.my_domain.com/"
            driver.get(base_url)
            driver.find_element_by_name("USER").clear()
            driver.find_element_by_name("USER").send_keys("my_username")
            driver.find_element_by_name("PASSWORD").clear()
            driver.find_element_by_name("PASSWORD").send_keys("my_password")
            driver.find_element_by_name("submit").click()
            cookies = driver.get_cookies()
            driver.close()
            return cookies

        def parse(self, response,my_cookies=get_cookies):
            return Request(url="http://my_domain.com/",
                cookies=my_cookies,
                callback=self.login)

        def login(self,response):
            return [FormRequest.from_response(response,
                formname='login_form',
                formdata={'USER': 'my_username', 'PASSWORD': 'my_password'},
                callback=self.after_login)]

        def after_login(self, response):
            hxs = HtmlXPathSelector(response)
            print hxs.select('/html/head/title').extract()


python authentication login selenium scrapy

JonDog Jun 26 2018-12-12T00:


1 answer

warvariuc · Accepted Answer · 2012-06-26 05:55

Your question is more of a debugging issue, so my answer will only have some notes on your question, not an exact answer.

    def parse(self, response,my_cookies=get_cookies):
        return Request(url="http://my_domain.com/",
            cookies=my_cookies,
            callback=self.login)

my_cookies=get_cookies - here you are assigning the function itself, not the result it returns. I don't think you need to pass a function as a parameter here at all. It should be:

    def parse(self, response):
        return Request(url="http://my_domain.com/",
            cookies=self.get_cookies(),
            callback=self.login)
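The difference between passing a function and passing its result can be seen in plain Python, without Scrapy or Selenium. This is only a minimal sketch: `get_cookies` here is a stand-in that returns made-up cookie data, and `parse_wrong` / `parse_right` are hypothetical names used for illustration (written for Python 3).

```python
def get_cookies():
    # Stand-in for the Selenium login; the cookie data is made up.
    return {"sessionid": "abc123"}


def parse_wrong(my_cookies=get_cookies):
    # The default value is the function object itself; it is never called.
    return my_cookies


def parse_right():
    # Calling the function yields the actual cookies.
    return get_cookies()


print(callable(parse_wrong()))  # True: we got a function, not cookies
print(parse_right())            # {'sessionid': 'abc123'}
```

In the spider, the first form would hand a function object to `Request(cookies=...)` instead of the cookie data.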

The cookies argument for Request should be a dict - please verify that what you pass is indeed a dict.
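Note that Selenium's `driver.get_cookies()` returns a list of dicts, one per cookie, each with at least a `name` and a `value` key, rather than a single dict. A minimal sketch of flattening that list into a plain name-to-value dict follows. The helper name `cookies_to_dict` and the sample cookie values are assumptions for illustration, not part of either library.

```python
def cookies_to_dict(selenium_cookies):
    """Collapse a list of Selenium cookie dicts into a {name: value} dict."""
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}


# Hypothetical sample in the shape returned by driver.get_cookies().
sample = [
    {"name": "SESSION", "value": "abc123", "domain": "my_domain.com", "path": "/"},
    {"name": "lang", "value": "en", "domain": "my_domain.com", "path": "/"},
]

print(cookies_to_dict(sample))  # {'SESSION': 'abc123', 'lang': 'en'}
```

The result could then be passed as `cookies=cookies_to_dict(self.get_cookies())`. This drops the domain and path information, which is acceptable only if all cookies belong to the site being crawled.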

I can’t even tell if something is working or not.

Put some print statements in the callbacks to follow the execution.
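As a rough sketch of what such a print could look like, the helper below reports which callback ran, the response status and URL, and whether the page contains a string that only appears after login. The helper name `trace` and the marker text "Logout" are assumptions; pick whatever text your site actually shows to logged-in users. It relies only on the `status`, `url` and `body` attributes of the response, so a simple stand-in object is used here in place of a real Scrapy response (written for Python 3).

```python
from types import SimpleNamespace


def trace(callback_name, response, marker="Logout"):
    """Print a one-line summary of the response and return True if it looks logged in.

    `marker` is a hypothetical string that appears only on logged-in pages.
    """
    body = response.body
    if isinstance(body, bytes):
        body = body.decode("utf-8", errors="replace")
    logged_in = marker in body
    print("%s: status=%s url=%s logged_in=%s"
          % (callback_name, response.status, response.url, logged_in))
    return logged_in


# Stand-in for a response, used only to demonstrate the helper.
fake_response = SimpleNamespace(
    status=200,
    url="http://my_domain.com/",
    body=b"<html><a href='/logout'>Logout</a></html>",
)
trace("after_login", fake_response)
```

Calling `trace("login", response)` and `trace("after_login", response)` at the top of each callback shows which callbacks are reached at all and at which step the login marker first appears.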
