There is no need to write this tool yourself; there are several good options.
make
make can do this quite easily, but it relies heavily on files to drive the process. (If you want to run some operation on each input file that produces an output file, this can be great.) The -j command-line option runs the specified number of tasks at once, and the -l command-line option specifies a system load average; make holds off starting new tasks while the load is at or above that value. (Which may be nice if you want to do some work "in the background." Don't forget about the nice(1) command, which can also help here.)
So, a quick (and untested) Makefile for image conversion:
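# One thumb_cimgN.jpg target for every cimgN.jpg source
ALL=$(patsubst cimg%.jpg,thumb_cimg%.jpg,$(wildcard cimg*.jpg))
.PHONY: all
all: $(ALL)
# Pattern rule: build thumb_X.jpg from X.jpg (the recipe line must start with a tab)
thumb_%.jpg: %.jpg
	convert $< -resize 100x100 $@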
If you run this with make, it will run the conversions one at a time. If you run make -j8, it will run eight jobs at once. If you run make -j with no number, it places no limit on the number of jobs and may start hundreds. (When compiling source code, I find that twice the number of cores is a great starting point. This gives each processor something to do while waiting for disk I/O requests. Different machines and different loads may work differently.)
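As a rough sketch of the "twice the number of cores" rule of thumb, a small shell helper can compute the flags. The function name suggest_make_flags is made up for illustration, and using the core count as the -l limit is only an assumption to tune for your own machine.

```shell
# Print -j and -l flags sized to this machine:
# twice the core count for -j, the core count itself as the load limit for -l.
suggest_make_flags() {
    cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
    printf '%s %s\n' "-j$((cores * 2))" "-l$cores"
}

# Possible use, lowering priority with nice(1) for background work:
#   nice -n 10 make $(suggest_make_flags)
```

The unquoted $(suggest_make_flags) in the usage line is deliberate, so that the shell splits the output into two separate arguments for make.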
xargs
xargs provides the --max-procs command-line option. This is best if the parallel work can be split up based on a single input stream, with either ASCII NUL-separated or newline-separated input items. (Well, the -d option allows you to choose something else, but these two are common and easy.) This gives you the advantage of using find(1)'s file-selection syntax instead of writing funny expressions like the one in the Makefile example above, or lets your input be completely unrelated to files. (Consider a program for factoring large composite numbers into prime factors: fitting that task into make would be awkward at best. xargs could do it easily.)
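The previous example might look something like this (GNU find and xargs assumed; -printf '%f\0' prints bare file names so that the thumb_ prefix yields a valid path, and the search is limited to the current directory):
find . -maxdepth 1 -name '*.jpg' ! -name 'thumb_*' -printf '%f\0' | xargs -0 --max-procs 16 -I {} convert {} -resize 100x100 thumb_{}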
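For input that has nothing to do with files, the factoring case can be sketched with factor(1) from coreutils. The function name factor_all and the sample numbers are only illustrative, and the limit of four processes is an arbitrary choice.

```shell
# Read one number per line on stdin and factor up to four of them at a time.
# -n 1 hands each factor(1) process a single number;
# -P is the short form of --max-procs.
factor_all() {
    xargs -P 4 -n 1 factor
}

printf '%s\n' 1000001 600851475143 2147483647 | factor_all
```

Because the processes run concurrently, the result lines may come back in a different order than the input; pipe through sort if the order matters.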
parallel
The moreutils package (available at least on Ubuntu) provides the parallel command. It can run in two different ways: either run a specified command on different arguments, or run different commands in parallel. The previous example might look like this:
parallel -i -j 16 convert {} -resize 100x100 thumb_{} -- *.jpg
beanstalkd
The beanstalkd program uses a completely different approach: it provides a message bus to which you submit requests, and job servers block waiting for jobs to be queued, execute them, and then return to waiting for a new job on the queue. If you want to write data back to the specific HTTP request that initiated the job, this may not be very convenient, since you must provide that mechanism yourself (possibly another "tube" on the beanstalkd server). But if the end result is sent asynchronously to a database, email, or something similar, this might be the easiest option to integrate into an existing application.