Unique jobs with kue for node.js

Question

I would like jobs.create to fail if an identical job is already in the system. Is there any way to do this?

I need to run the same task every 24 hours, but some tasks can take more than 24 hours, so I need to be sure that the task is not already in the system (active, or waiting in the queue) before adding it.

UPDATE: I'm going to simplify the problem to explain it here. Suppose I have an analytics service and I have to send a report to my users once a day. Generating these reports sometimes (only in a few cases, but it is a possibility) takes several hours, even more than a day.

I need to know which tasks are currently running to avoid duplicating them. I could not find anything in the kue API to find out which jobs are currently running. I also need some kind of event that fires when more jobs are needed, so that I can then call my producer, getMoreJobs.

Maybe my approach is wrong. If so, please let me know the best way to solve my problem.

This is my simplified code:

    var kue = require('kue'),
        cluster = require('cluster'),
        numCPUs = require('os').cpus().length;

    numCPUs = CONFIG.sync.workers || numCPUs;

    var jobs = kue.createQueue();

    if (cluster.isMaster) {
      console.log('Starting master pid:' + process.pid);

      // Remove each job once it has completed
      jobs.on('job complete', function(id){
        kue.Job.get(id, function(err, job){
          if (err || !job) return;
          job.remove(function(err){
            if (err) throw err;
            console.log('removed completed job #%d', job.id);
          });
        });
      });

      function getMoreJobs() {
        console.log('looking for more jobs...');
        getOutdateReports(function (err, reports) {
          // On error, retry in 5 hours
          if (err) return setTimeout(getMoreJobs, 5 * 60 * 60 * 1000);
          reports.forEach(function(report) {
            jobs.create('reports', {
              id: report.id,
              title: report.name,
              params: report.params
            }).attempts(5).save();
          });
          // Look for outdated reports again in 1 hour
          setTimeout(getMoreJobs, 60 * 60 * 1000);
        });
      }

      //Create the jobs
      getMoreJobs();

      console.log('Starting ', numCPUs, ' workers');
      for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
      }

      cluster.on('death', function(worker) {
        console.log('worker pid:' + worker.pid + ' died!'.bold.red);
      });
    } else {
      //Process the jobs
      console.log('Starting worker pid:' + process.pid);
      jobs.process('reports', 20, function(job, done){
        //completing my work here
        veryHardWorkGeneratingReports(function(err) {
          if (err) return done(err);
          return done();
        });
      });
    }

+4

javascript node.js parallel-processing

aartiles Jan 27 '12 at 11:17

2 answers

The answer to one of your questions is that Kue moves the jobs it pops off the Redis queue into the "active" state, and you will never see them unless you look for them.
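As a rough illustration of looking for them, here is a minimal sketch. It assumes your kue version exposes Job.rangeByType(type, state, from, to, order, callback), which is what the JSON API code appears to call internally, so check that before relying on it. The lookup is passed in as a parameter so the logic can be tried without Redis; isReportInFlight and IN_FLIGHT_STATES are hypothetical names, not part of kue.

```javascript
// Sketch only. `rangeByType` is injected; with kue it would be something like
// kue.Job.rangeByType.bind(kue.Job) (assumed signature:
// type, state, from, to, order, callback).
var IN_FLIGHT_STATES = ['inactive', 'active', 'delayed'];

// Calls back with (err, true) if any queued, running or delayed 'reports' job
// carries the given report id in its data, and (err, false) otherwise.
function isReportInFlight(rangeByType, reportId, callback) {
  var pending = IN_FLIGHT_STATES.length;
  var found = false;
  var failed = false;

  IN_FLIGHT_STATES.forEach(function (state) {
    rangeByType('reports', state, 0, -1, 'asc', function (err, jobs) {
      if (failed) return;               // an earlier lookup already reported an error
      if (err) {
        failed = true;
        return callback(err);
      }
      jobs.forEach(function (job) {
        if (job.data && job.data.id === reportId) found = true;
      });
      if (--pending === 0) callback(null, found);
    });
  });
}
```

The producer could call this before jobs.create and skip the report when the result is true. Note that the check and the create are not atomic, so two producers running at the same time could still enqueue the same report.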

The answer to the other question is that your distributed work queue is the consumer, not the producer of jobs. Mixing them the way you have is okay, but it is a muddy paradigm. What I did with Kue was write a wrapper for kue's JSON API so that a job can be queued from anywhere in the system. Since you seem to need to shovel jobs in, I suggest writing a separate producer application that does nothing but receive external tasks and insert them into the Kue work queue. It can watch the work queue for when jobs are running low and load a batch, or, what I would do, make it shovel jobs in as fast as it can and spin up multiple instances of your consumer application to process the load faster.

To reiterate: your separation of concerns is not very good here. You should have a task producer that is completely separate from your task consumer application. This gives you more flexibility, ease of scaling (just start another consumer on a different machine, and you have scaled!) and overall ease of code management. You should also, if possible, give whoever provides the tasks you are "looking for" access to your Kue JSON API server, instead of going out and finding the tasks yourself. The task producer can then schedule its own tasks with Kue.
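A minimal sketch of what the core of such a separate producer might look like, under stated assumptions: enqueueMissingReports is a hypothetical helper, inFlightIds is a list of report ids you have already collected from the queue, and createJob stands in for the jobs.create(...).attempts(5).save() call from the question.

```javascript
// Sketch only: enqueue a job for every report that has no queued or running
// job yet. `createJob(type, data)` is injected; with kue it could be
//   function (type, data) { jobs.create(type, data).attempts(5).save(); }
function enqueueMissingReports(reports, inFlightIds, createJob) {
  var seen = Object.create(null);       // report ids that must not be enqueued again
  inFlightIds.forEach(function (id) { seen[id] = true; });

  var created = [];
  reports.forEach(function (report) {
    if (seen[report.id]) return;        // already queued, running, or just created
    seen[report.id] = true;
    createJob('reports', {
      id: report.id,
      title: report.name,
      params: report.params
    });
    created.push(report.id);
  });
  return created;                       // ids for which a job was created
}
```

Because the producer is the only process that creates jobs, the consumers can stay as plain jobs.process workers and be started on as many machines as needed.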

+3

Nathan C. Tresch May 08 '12 at 10:17

Teemu · Accepted Answer · 2012-01-27T14:49:47+0000

Take a look at https://github.com/LearnBoost/kue.

In the json.js script, check lines 64-112. There you will find methods that return an object containing jobs, optionally filtered by type, state, or ID range (jobRange(), jobStateRange(), jobTypeRange()).

If you scroll down the main page to the JSON API section, you will find examples of the returned objects.
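As a sketch of how those results could be used over HTTP: the README lists a route of the form GET /jobs/:type/:state/:from..:to/:order?, which is what the helper below builds. The helper names (jobTypeRangePath, hasReport) are hypothetical, and the host and port the kue app listens on are up to your setup; verify the route against the README of the kue version you have installed.

```javascript
// Sketch only: build the path for kue's "jobs by type and state" JSON route.
function jobTypeRangePath(type, state, from, to, order) {
  var path = '/jobs/' + encodeURIComponent(type) +
             '/' + encodeURIComponent(state) +
             '/' + from + '..' + to;
  if (order) path += '/' + order;       // 'asc' or 'desc', optional
  return path;
}

// Given the parsed JSON array returned by that route, report whether any job
// carries the given report id in its data.
function hasReport(jobsJson, reportId) {
  return jobsJson.some(function (job) {
    return Boolean(job.data) && job.data.id === reportId;
  });
}
```

For example, jobTypeRangePath('reports', 'active', 0, 99, 'asc') gives '/jobs/reports/active/0..99/asc', which you would request from wherever the kue app is listening.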

You know how to call and use those methods much better than I do.

jobs.create() will fail if you pass it an unknown keyword. I would write a function that checks the job currently being visited in the forEach loop and returns the keyword. Then simply call this function instead of the literal keyword in the jobs.create() parameters.

The information obtained from these methods in json.js can also help you create the "moreJobToDo" event.
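One way such an event might be wired up is sketched below, assuming you can get the number of waiting jobs from somewhere (kue's queue object appears to offer inactiveCount(callback), but treat that as an assumption to verify). createLowWaterWatcher and the threshold are hypothetical; the event name is the one used above.

```javascript
var EventEmitter = require('events').EventEmitter;

// Sketch only: emits 'moreJobToDo' whenever a check finds fewer than
// `threshold` waiting jobs. `getInactiveCount(callback)` is injected; with kue
// it could be function (cb) { jobs.inactiveCount(cb); } (assumed API).
function createLowWaterWatcher(getInactiveCount, threshold) {
  var emitter = new EventEmitter();

  emitter.check = function () {
    getInactiveCount(function (err, count) {
      // Note: an 'error' event with no listener attached throws in Node.
      if (err) return emitter.emit('error', err);
      if (count < threshold) emitter.emit('moreJobToDo', count);
    });
  };

  return emitter;
}
```

The producer would listen with watcher.on('moreJobToDo', getMoreJobs) and call watcher.check() on a timer or after each 'job complete' event.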
