How to gracefully close coroutines with Ctrl + C?

I am writing a spider to crawl web pages. I know asyncio is probably my best bet, so I use coroutines to process the work asynchronously. Now I am scratching my head over how to exit the program on a keyboard interrupt. The program shuts down fine once all the work is done. The source code runs on Python 3.5 and is listed below.

    import asyncio
    import aiohttp
    from contextlib import suppress


    class Spider(object):
        def __init__(self):
            self.max_tasks = 2
            self.task_queue = asyncio.Queue(self.max_tasks)
            self.loop = asyncio.get_event_loop()
            self.counter = 1

        def close(self):
            for w in self.workers:
                w.cancel()

        async def fetch(self, url):
            try:
                async with aiohttp.ClientSession(loop = self.loop) as self.session:
                    with aiohttp.Timeout(30, loop = self.session.loop):
                        async with self.session.get(url) as resp:
                            print('get response from url: %s' % url)
            except:
                pass
            finally:
                pass

        async def work(self):
            while True:
                url = await self.task_queue.get()
                await self.fetch(url)
                self.task_queue.task_done()

        def assign_work(self):
            print('[*]assigning work...')
            url = 'https://www.python.org/'
            if self.counter > 10:
                return 'done'
            for _ in range(self.max_tasks):
                self.counter += 1
                self.task_queue.put_nowait(url)

        async def crawl(self):
            self.workers = [self.loop.create_task(self.work()) for _ in range(self.max_tasks)]
            while True:
                if self.assign_work() == 'done':
                    break
                await self.task_queue.join()
            self.close()


    def main():
        loop = asyncio.get_event_loop()
        spider = Spider()
        try:
            loop.run_until_complete(spider.crawl())
        except KeyboardInterrupt:
            print ('Interrupt from keyboard')
            spider.close()
            pending = asyncio.Task.all_tasks()
            for w in pending:
                w.cancel()
                with suppress(asyncio.CancelledError):
                    loop.run_until_complete(w)
        finally:
            loop.stop()
            loop.run_forever()
            loop.close()


    if __name__ == '__main__':
        main()

But if I press Ctrl + C while it is running, strange things can happen. Sometimes the program shuts down gracefully on Ctrl + C, with no error message. In other cases, however, the program keeps running after Ctrl + C and does not stop until all the work has been done. If I press Ctrl + C again at that point, I get "Task was destroyed but it is pending!".

I have read several topics about asyncio and added code to main() to close the coroutines gracefully, but it does not work. Does anyone have a similar problem?

2 answers

I am sure the problem is here:

    except:
        pass

You should never do this. Your situation is one more example of what can happen if you do.

When you cancel a task and then await it, asyncio.CancelledError is raised inside the task, and it must not be suppressed anywhere inside it. The line where you await the cancelled task should raise this exception; otherwise the task will keep running.

That's why you do

    task.cancel()
    with suppress(asyncio.CancelledError):
        loop.run_until_complete(task)  # this line should raise CancelledError,
                                       # otherwise task will continue

to actually cancel the task.
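The same cancel-then-await idea can be applied to a whole group of workers at once. The following is only a sketch under assumptions: `worker`, `cancel_all` and `demo` are hypothetical names that do not appear in the question, and `asyncio.run` requires Python 3.7 or newer (the question targets 3.5, where `loop.run_until_complete` would be used instead).

```python
import asyncio


async def worker(results):
    # Hypothetical worker: it does not swallow CancelledError,
    # it only records that its cleanup ran.
    try:
        await asyncio.sleep(3600)
    finally:
        results.append("cleaned up")


async def cancel_all(tasks):
    # Request cancellation, then await every task so that the
    # CancelledError is actually delivered and consumed.
    for task in tasks:
        task.cancel()
    # return_exceptions=True collects each CancelledError instead of raising it.
    return await asyncio.gather(*tasks, return_exceptions=True)


async def demo():
    results = []
    tasks = [asyncio.ensure_future(worker(results)) for _ in range(2)]
    await asyncio.sleep(0)  # give the workers a chance to start
    outcomes = await cancel_all(tasks)
    return results, outcomes, [t.cancelled() for t in tasks]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Awaiting the cancelled tasks is the important step here: `cancel()` on its own only requests cancellation.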

Upd:

But I still don't understand why the original code sometimes exits cleanly on Ctrl + C, seemingly at random?

It depends on the state of your tasks:

  • If all tasks are already done at the moment you press Ctrl + C, none of them will raise CancelledError when awaited, and your code will finish normally.
  • If some tasks are still pending at the moment you press Ctrl + C but are close to finishing, your code will hang briefly on cancelling them and will end once they finish shortly afterwards.
  • If some tasks are pending and far from finished at the moment you press Ctrl + C, your code will get stuck trying to cancel these tasks (which cannot be done). Another Ctrl + C will interrupt the cancellation process, but the tasks will be neither cancelled nor finished, and you will get the warning "Task was destroyed but it is pending!".
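The difference between a task that can be cancelled and one that cannot is easy to reproduce in isolation. The sketch below is illustrative only: `stubborn` and `cooperative` are hypothetical coroutines, with `stubborn` imitating the bare `except: pass` from the question, and `asyncio.run` assumes Python 3.7 or newer.

```python
import asyncio


async def stubborn(log):
    # Imitates the question's fetch(): the bare except swallows the
    # CancelledError, so the coroutine simply carries on.
    try:
        await asyncio.sleep(3600)
    except:  # noqa: E722 (deliberately too broad, to show the problem)
        pass
    await asyncio.sleep(0.01)
    log.append("stubborn finished its work")


async def cooperative(log):
    # Cleans up and re-raises, so the task really ends up cancelled.
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        log.append("cooperative cleaned up")
        raise


async def demo():
    log = []
    tasks = [
        asyncio.ensure_future(stubborn(log)),
        asyncio.ensure_future(cooperative(log)),
    ]
    await asyncio.sleep(0)  # let both tasks reach their first await
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return log, [t.cancelled() for t in tasks]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

After the run, only the cooperative task reports `cancelled()` as true; the stubborn one ignored the request and ran to completion.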

I assume you are using some flavor of Unix; if not, my comments may not apply to your situation.

Pressing Ctrl - C in a terminal sends the SIGINT signal to all processes associated with that tty. The Python process catches this Unix signal and translates it into a KeyboardInterrupt exception. In a threaded application (I'm not sure whether the async stuff uses threads internally, but it sounds very much like it does), typically only one thread (the main thread) receives this signal and reacts in this way. If it is not specifically prepared for this situation, it will terminate because of the exception.
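For asyncio code on Unix, one alternative to catching KeyboardInterrupt is to let the event loop handle SIGINT itself through `loop.add_signal_handler`. This is only a sketch under assumptions: `worker` and `main` are hypothetical, the `self_interrupt_after` parameter exists purely to simulate Ctrl - C in the demo, the code must run in the main thread, and `asyncio.run` and `asyncio.get_running_loop` need Python 3.7 or newer.

```python
import asyncio
import os
import signal


async def worker():
    try:
        await asyncio.sleep(3600)
    except asyncio.CancelledError:
        # Cleanup would go here; re-raise so the task is really cancelled.
        raise


async def main(self_interrupt_after=None):
    loop = asyncio.get_running_loop()
    task = asyncio.ensure_future(worker())
    # Unix only: SIGINT now calls task.cancel() inside the event loop
    # instead of raising KeyboardInterrupt at an arbitrary point.
    loop.add_signal_handler(signal.SIGINT, task.cancel)
    if self_interrupt_after is not None:
        # Demo helper (assumption): send SIGINT to ourselves after a delay.
        loop.call_later(self_interrupt_after, os.kill, os.getpid(), signal.SIGINT)
    try:
        await task
    except asyncio.CancelledError:
        return "cancelled cleanly"
    finally:
        loop.remove_signal_handler(signal.SIGINT)
    return "finished"


if __name__ == "__main__":
    print(asyncio.run(main(self_interrupt_after=0.05)))
```

Because the handler runs as an ordinary loop callback, the cancellation arrives at an `await` point rather than in the middle of the loop's own bookkeeping.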

The threading machinery will then wait for the threads that are still running to finish before the Unix process as a whole terminates with an exit code. This can take quite a while. See this question about killing threads and why that is not possible in general.

What you want to do, I believe, is kill your process immediately, terminating all threads in one step.

The easiest way to achieve this is to press Ctrl - \ . This sends SIGQUIT instead of SIGINT, which typically affects the other threads as well and causes them to terminate.

If this is not enough (because for some reason you need to react properly to Ctrl - C), you can send the signal to yourself:

    import os, signal
    os.kill(os.getpid(), signal.SIGQUIT)

This should terminate all running threads unless they specifically catch SIGQUIT, in which case you can still use SIGKILL to perform a hard kill. SIGKILL, however, gives them no chance to react and can lead to problems.


Source: https://habr.com/ru/post/1270521/

