How to use Tornado with APScheduler?

I run Python's APScheduler and periodically want to POST to some HTTP resources, using Tornado's AsyncHTTPClient inside a scheduled job. Each job performs several POSTs. When each HTTP request gets a response, a callback is called (I think Tornado uses futures to accomplish this).

I am worried about thread safety here, since APScheduler runs jobs on different threads. I could not find a well-explained example of how best to use Tornado across multiple threads in this context.

How can I best use APScheduler with Tornado this way?

Specific problems:

  • Which Tornado IOLoop should I use? The docs say that AsyncHTTPClient "works like magic." Well, magic scares me. Do I need to use AsyncHTTPClient with the current thread's loop, or can I use the main one (you can specify it)?

  • Are there any thread safety issues with my callback that I am using?

  • It is not clear what happens when the thread terminates while there is still a pending callback / future that needs to be called. Is there a problem here?

  • Since APScheduler runs jobs as threads within one process, and Python has a GIL, is performance roughly the same with a single IOLoop on the main thread as with several loops on different threads?

1 answer
  • All Tornado utilities are built around the Tornado IOLoop, and that includes AsyncHTTPClient. IOLoop is not thread-safe, so you should not use AsyncHTTPClient from any thread other than the one running your main IOLoop. For more on how to use IOLoop, read its documentation.

  • If you use tornado.ioloop.IOLoop.instance(), you may unintentionally add callbacks to the main thread's IOLoop. Use tornado.ioloop.IOLoop.current() to get the IOLoop instance that belongs to the current thread. Adding a callback to one non-main thread's IOLoop from another non-main thread takes far too much bookkeeping; it just gets too messy.

  • I do not quite understand this question, but as I see it there are two scenarios: the thread either runs an IOLoop or it does not. If the thread does not run an IOLoop, then once the thread finishes, any pending callback has to be run by an IOLoop on some other thread (probably the main thread). If the thread does run an IOLoop, it will not finish until you stop that IOLoop, so whether the callback runs depends on when you stop the loop.

  • Honestly, I don't see much point in using threads with Tornado. There is no performance gain unless you run on PyPy, and I am not sure Tornado works well there (not everything is known to work on PyPy, and frankly I don't know about Tornado either). If your application is a web server, you can instead run several processes of it behind Nginx as a reverse proxy and load balancer. Since you mentioned APScheduler, I would suggest IOLoop's add_timeout, which does almost the same thing you need and, being native to Tornado, fits in much better. Callbacks are hard to debug as it is; combine them with Python threads and you can end up with a huge mess. If you are willing to consider another option, move all the asynchronous processing out of this process, which will make life much easier. Think of something like Celery for that.
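
If the scheduler threads have to stay, one way to follow the advice above is to keep all Tornado work on the loop thread and let each job only hand work over to it. Tornado documents IOLoop.add_callback as safe to call from any thread, so a rough sketch could look like the following. The helper names make_job, post_all and demo are made up for illustration, and the demo uses a plain thread in place of an APScheduler worker and does no network I/O.

```python
import asyncio
import threading

from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop


async def post_all(urls, body):
    """Meant to run on the IOLoop thread: POST `body` to every URL in turn."""
    client = AsyncHTTPClient()
    codes = []
    for url in urls:
        response = await client.fetch(url, method="POST", body=body)
        codes.append(response.code)
    return codes


def make_job(loop, callback, *args):
    """Build a job for a scheduler worker thread (hypothetical helper).

    The job does no Tornado work itself. It only asks `loop` to run
    `callback(*args)` later, on the loop's own thread.
    """
    def job():
        loop.add_callback(callback, *args)
    return job


async def demo():
    """Network-free demonstration of the handoff."""
    loop = IOLoop.current()  # the loop that is driving this coroutine
    done = asyncio.Event()
    seen = {"loop_thread": threading.current_thread()}

    def on_loop_thread(tag):
        seen["tag"] = tag
        seen["callback_thread"] = threading.current_thread()
        done.set()  # fine here: this runs on the loop thread

    # A plain thread stands in for a scheduler worker thread.
    worker = threading.Thread(target=make_job(loop, on_loop_thread, "hello"))
    seen["worker_thread"] = worker
    worker.start()
    await done.wait()
    worker.join()
    return seen
```

Running asyncio.run(demo()) should show the callback executing on the loop thread rather than on the worker. With APScheduler 3.x, the same idea would presumably be wired up as something like scheduler.add_job(make_job(loop, post_all, urls, body), "interval", seconds=60), where loop is the IOLoop captured on the main thread; add_callback also accepts coroutine functions such as post_all.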

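The add_timeout suggestion from the answer can be sketched as a timer that re-arms itself, so that recurring work runs on the IOLoop thread with no extra threads involved. This is only a minimal illustration: run_every and demo are hypothetical names, and the intervals are kept tiny so the demo finishes quickly.

```python
import asyncio

from tornado.ioloop import IOLoop


def run_every(loop, interval, func, *args):
    """Call func(*args) every `interval` seconds on `loop` (hypothetical helper).

    Returns a function that cancels any further runs.
    """
    state = {"handle": None, "cancelled": False}

    def tick():
        if state["cancelled"]:
            return
        # Re-arm first, so an exception raised by func does not end the schedule.
        state["handle"] = loop.add_timeout(loop.time() + interval, tick)
        func(*args)

    def cancel():
        state["cancelled"] = True
        if state["handle"] is not None:
            loop.remove_timeout(state["handle"])

    state["handle"] = loop.add_timeout(loop.time() + interval, tick)
    return cancel


async def demo():
    """Let the timer fire three times, then cancel it."""
    loop = IOLoop.current()
    calls = []
    done = asyncio.Event()

    def work(tag):
        calls.append(tag)
        if len(calls) == 3:
            done.set()

    cancel = run_every(loop, 0.05, work, "tick")
    await done.wait()
    cancel()
    await asyncio.sleep(0.12)  # long enough for a non-cancelled timer to fire again
    return calls
```

For a fixed interval, Tornado's own tornado.ioloop.PeriodicCallback covers much the same ground; the hand-rolled version is shown only because the answer names add_timeout.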

Source: https://habr.com/ru/post/1479273/

