ReactorNotRestartable - Twisted and scrapy

Before linking me to the other answers related to this, please note that I read them and am still a bit confused. Ok, here we go.

So, I am creating a webapp in Django. I am importing the latest scrapy library to crawl a website. I do not use celery (I know very little about it, but saw it in other topics related to this).

One of the URLs on our site, /crawl/, is for launching the crawler. This is the only URL on our site that requires scrapy. Here is the function called when visiting the URL:

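from django.shortcuts import render
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from twisted.internet import reactor

# ReviewSpider is the project's own spider class (its import is not shown).
def crawl(request):
  configure_logging({'LOG_FORMAT': '%(levelname)s: %(message)s'})
  runner = CrawlerRunner()

  d = runner.crawl(ReviewSpider)
  d.addBoth(lambda _: reactor.stop())
  reactor.run() # the script will block here until the crawling is finished

  return render(request, 'index.html')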

You will notice that this is an adaptation of the scrapy tutorial on their website. The first time this URL is visited after starting the server, everything works as intended. From the second time on, a ReactorNotRestartable exception is thrown. I understand that this exception occurs when a reactor that has already been stopped is asked to start again, which is not possible.

Looking at the sample code, I would have assumed that the line "runner = CrawlerRunner()" returns a new reactor to use each time this URL is visited. But I suspect my understanding of Twisted reactors is not entirely correct.

How can I create and start a new reactor every time this URL is visited?

Thank you very much


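The reactor is global to the process. Once it has been stopped it cannot be started again, and creating a new CrawlerRunner does not create a new reactor.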

One way to deal with this is Crochet (a library for using Twisted from blocking code).

Here is an example:

#!/usr/bin/python
"""
Do a DNS lookup using Twisted APIs.
"""
from __future__ import print_function

# The Twisted code we'll be using:
from twisted.names import client

from crochet import setup, wait_for
setup()


# Crochet layer, wrapping Twisted DNS library in a blocking call.
@wait_for(timeout=5.0)
def gethostbyname(name):
    """Lookup the IP of a given hostname.

    Unlike socket.gethostbyname() which can take an arbitrary amount of time
    to finish, this function will raise crochet.TimeoutError if more than 5
    seconds elapse without an answer being received.
    """
    d = client.lookupAddress(name)
    d.addCallback(lambda result: result[0][0].payload.dottedQuad())
    return d


if __name__ == '__main__':
    # Application code using the public API - notice it works in a normal
    # blocking manner, with no event loop visible:
    import sys
    name = sys.argv[1]
    ip = gethostbyname(name)
    print(name, "->", ip)

Here, gethostbyname is a blocking wrapper around a Twisted API. The lookup itself is done by twisted.names.client, which needs a running reactor.

Note: there are no reactor.run or reactor.stop calls here; the call to setup takes care of that.


Source: https://habr.com/ru/post/1665936/

