Scrapy with selenium, webdriver not instantiating

I am trying to use selenium / phantomjs with scrapy and I am riddled with errors. For example, take the following code snippet:

    def parse(self, response):
        while True:
            try:
                driver = webdriver.PhantomJS()
                # do some stuff
                driver.quit()
                break
            except (WebDriverException, TimeoutException):
                try:
                    driver.quit()
                except UnboundLocalError:
                    print("Driver failed to instantiate")
                time.sleep(3)
                continue

Often the driver fails to instantiate (so `driver` is unbound, hence the UnboundLocalError), and I get the following message, along with the print output I added:

 Exception AttributeError: "'Service' object has no attribute 'process'" in <bound method Service.__del__ of <selenium.webdriver.phantomjs.service.Service object at 0x7fbb28dc17d0>> ignored 

Searching around, everything I find suggests updating phantomjs, which I have already done (1.9.8, built from source). Does anyone know what else could cause this problem, and how to diagnose it?

+6
3 answers

The reason for this behavior is the way the PhantomJS driver's Service class is implemented.

There is a __del__ method that calls the self.stop() method:

    def __del__(self):
        # subprocess.Popen doesn't send signal on __del__;
        # we have to try to stop the launched process.
        self.stop()

And self.stop() assumes that the service instance is still alive and tries to read its attributes:

    def stop(self):
        """
        Cleans up the process
        """
        if self._log:
            self._log.close()
            self._log = None
        #If its dead dont worry
        if self.process is None:
            return
        ...
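
To see the failure in isolation, here is a minimal sketch that does not use Selenium at all. `FakeService` and `stop_error_after_failed_start` are hypothetical names, and the sketch assumes that `process` is only assigned once the browser binary has launched successfully, which is what the traceback in the question suggests.

```python
class FakeService(object):
    """Hypothetical stand-in that mimics the shape of the Service class."""

    def __init__(self):
        self._log = None  # set in __init__, so stop() can always read it
        # Assumption: `process` is NOT assigned here, only by a successful start().

    def start(self):
        # Simulate the browser binary failing to launch before `process` is set.
        raise OSError("could not launch the browser binary")

    def stop(self):
        if self._log:
            self._log.close()
            self._log = None
        if self.process is None:  # AttributeError if start() never succeeded
            return

    def __del__(self):
        self.stop()


def stop_error_after_failed_start():
    """Return the message of the AttributeError raised by stop(), or None."""
    service = FakeService()
    try:
        service.start()
    except OSError:
        pass
    try:
        service.stop()
    except AttributeError as exc:
        return str(exc)
    return None
```

When the half-initialized instance is later garbage-collected, `__del__` hits the same AttributeError. Exceptions raised inside `__del__` cannot propagate, so Python only reports them as "ignored", which matches the message in the question.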

Exactly the same problem is beautifully described in this thread:


What you should do is silently ignore the AttributeError when quitting the driver instance:

    try:
        driver.quit()
    except AttributeError:
        pass

The problem was introduced by this version, which means that downgrading Selenium to 2.40.0 will also help.
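
Putting this together with the retry loop from the question, one possible shape is sketched below. `run_with_driver`, `make_driver` and `work` are hypothetical names, not part of Selenium or Scrapy. The driver is bound to None before each attempt, so the cleanup never depends on UnboundLocalError, and the AttributeError from quit() is swallowed as described above.

```python
import time


def run_with_driver(make_driver, work, retries=3, delay=3.0, retry_on=(Exception,)):
    """Create a driver, run work(driver), and always try to quit the driver.

    make_driver: zero-argument callable returning a driver (hypothetical hook).
    work: callable taking the driver and returning a result (hypothetical hook).
    retry_on: exception types that trigger another attempt.
    """
    last_error = None
    for _ in range(retries):
        driver = None  # bound up front, so cleanup never hits UnboundLocalError
        try:
            driver = make_driver()
            return work(driver)
        except retry_on as exc:
            last_error = exc
            time.sleep(delay)
        finally:
            if driver is not None:
                try:
                    driver.quit()
                except AttributeError:
                    # The half-initialized Service case: nothing left to stop.
                    pass
    raise RuntimeError("driver failed after %d attempts: %r" % (retries, last_error))
```

With Selenium, this would presumably be called with `webdriver.PhantomJS` as `make_driver` and `retry_on=(WebDriverException, TimeoutException)`, as in the question.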

+6

I had this problem because phantomjs was not available to the script (it was not on the PATH). You can check this by running phantomjs in the console.
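
The same check can be done from inside the script. This is a small sketch using `shutil.which` from the standard library (Python 3.3+); `find_on_path` and `describe_lookup` are hypothetical helper names.

```python
import shutil


def find_on_path(name="phantomjs"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def describe_lookup(name="phantomjs"):
    """Return a one-line, human-readable result of the PATH lookup."""
    location = find_on_path(name)
    if location is None:
        return "%s was not found on PATH" % name
    return "%s found at %s" % (name, location)
```

If the lookup returns None, the process running the spider cannot see the binary, even if it works in your interactive shell. In that case it may help to fix PATH for that process, or to pass the full path through the `executable_path` argument of `webdriver.PhantomJS`.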

+2

Selenium version 2.44.0 on PyPI needs the following patch in Service.__init__ from selenium.webdriver.phantomjs.service:

 self.process = None 

I was thinking of submitting a patch, but the fix already exists in the latest version on Google Code.
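
If editing the installed package is not an option, a class-level default should have a similar effect, because attribute lookup falls back to the class when the instance has no `process` of its own. This is only a sketch of that idea on a stand-in class, not the upstream fix. `StandInService` and `apply_class_level_default` are hypothetical names, and the return strings exist only to make the behavior visible.

```python
class StandInService(object):
    """Hypothetical stand-in for the PhantomJS Service class (not Selenium code)."""

    def __init__(self):
        self._log = None
        # Unpatched behavior: `process` is not assigned in __init__.

    def stop(self):
        if self._log:
            self._log.close()
            self._log = None
        if self.process is None:
            return "nothing to stop"  # the real method just returns here
        return "stopped"


def apply_class_level_default(service_cls):
    """Give every instance a fallback `process = None` through the class."""
    if not hasattr(service_cls, "process"):
        service_cls.process = None
    return service_cls
```

After `apply_class_level_default(StandInService)`, calling stop() on an instance that never launched returns quietly instead of raising AttributeError. An instance that does assign `self.process` later still shadows the class-level default.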

0

Source: https://habr.com/ru/post/980224/

