One of the features I like most about the Google App Engine Task Queue is its simplicity. In particular, to start a task you just need a URL and some parameters, and the queue POSTs those parameters to that URL.
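For illustration, enqueuing a task on App Engine looks roughly like this (the handler URL and params here are made up):

```python
from google.appengine.api import taskqueue

# App Engine will POST these params to the given URL; whatever code
# is currently deployed at that URL handles the work.
taskqueue.add(url='/tasks/send_email', params={'to': 'user@example.com'})
```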
This structure means that tasks always execute against the latest version of the code. My Gearman workers, by contrast, run code inside my Django project, so whenever I push a new version live I have to kill the old workers and start new ones so that they pick up the current code.
My goal is to decouple the task queue from the code base, so that I can push a new version live without restarting any workers. So I thought: why not make tasks executable by URL, the same way the Google App Engine task queue does?
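Concretely, each task would then just be an ordinary Django view, so a deploy is picked up on the very next request. A minimal sketch (the view name and payload field are hypothetical):

```python
# A task endpoint is just a plain Django view.
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt  # the relay's POST carries its own signature instead of a CSRF token
def send_email_task(request):
    to_addr = request.POST['to']
    # ... do the actual work here ...
    return HttpResponse('ok')
```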
The process will work as follows:
- A user request arrives and kicks off several tasks that should not block the response.
- Each task has a unique URL, so I POST the task data, together with that URL, to a relay server.
- The relay server picks a worker and hands it the URL and the data.
- The worker simply POSTs the data to the URL, completing the task (a sketch of such a worker follows this list).
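Here is a minimal sketch of such a relay worker, assuming the python-gearman client and the requests library; the `relay` task name and the JSON envelope format are my own invention:

```python
# relay_worker.py -- a generic worker whose only job is to forward
# a task payload to its target URL. It imports nothing from the
# Django project, so deploys never require restarting it.
import json

import gearman
import requests

def relay(worker, job):
    envelope = json.loads(job.data)  # e.g. {"url": "...", "data": {...}}
    resp = requests.post(envelope['url'], data=envelope['data'], timeout=10)
    resp.raise_for_status()          # surface failures to Gearman
    return 'ok'

gm_worker = gearman.GearmanWorker(['localhost:4730'])
gm_worker.register_task('relay', relay)
gm_worker.work()
```

The point is that the worker stays completely generic: all task-specific logic lives behind URLs in the web app.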
Assume the following:
- Each request from the relay worker is signed in some way, so that we know it comes from the relay server and not from a malicious client (see the HMAC sketch after this list).
- Tasks are limited to running in under 10 seconds (there are no long-running tasks that could tie up a worker).
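For the signing, something like an HMAC over the request body with a shared secret would do; a sketch (the header name and secret handling are assumptions):

```python
import hashlib
import hmac

SHARED_SECRET = b'change-me'  # known only to the relay and the web app

def sign(body):
    # Relay side: send this, e.g., in an X-Task-Signature header.
    return hmac.new(SHARED_SECRET, body, hashlib.sha256).hexdigest()

def verify(body, signature):
    # Web app side: reject the request unless the signatures match.
    return hmac.compare_digest(sign(body), signature)
```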
What are the potential pitfalls of this approach? Here is what bothers me:
- The server could get swamped by many simultaneous requests that were all triggered by a single earlier request: one user request could fan out into 10 simultaneous HTTP requests. I suppose I could have a single worker sleep before each request as a crude rate limit.
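The sleep idea might look like this (a crude sketch; the interval is arbitrary):

```python
import time

import requests

MIN_INTERVAL = 0.5  # seconds between dispatches

def rate_limited_dispatch(jobs):
    # A single dispatcher that sleeps between requests turns N queued
    # tasks into N sequential POSTs instead of N simultaneous ones.
    for url, data in jobs:
        requests.post(url, data=data, timeout=10)
        time.sleep(MIN_INTERVAL)
```

Of course, serializing the dispatch trades away exactly the parallelism the fan-out was meant to provide.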
Any thoughts?