I have N workers that need to process incoming batches of data. Each worker is configured so that it knows it is "worker X of N".
Each incoming batch of data has a random unique ID (random in the sense that IDs are evenly distributed) and a different size; processing time is proportional to size. Sizes can vary greatly.
When a new batch of data becomes available, it is immediately visible to all N workers, but I want only one of them to process it, without any coordination between them. Right now, every worker computes ID % N == X; if it is true, that worker assigns the batch to itself, and the rest skip it. This works correctly and ensures that, on average, each worker processes the same number of batches. Unfortunately, it does not take batch size into account, so some workers can finish much later than others because they happen to be assigned very large batches.
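To make the current scheme concrete, here is a minimal Python sketch of it. The names `is_mine` and `simulate`, the 64-bit IDs, and the heavy-tailed (Pareto) size distribution are assumptions chosen only for illustration, not part of the real system.

```python
import random


def is_mine(batch_id: int, worker_index: int, num_workers: int) -> bool:
    """Current rule: worker X of N claims the batch when ID % N == X."""
    return batch_id % num_workers == worker_index


def simulate(num_workers: int = 4, num_batches: int = 1000, seed: int = 0):
    """Replay the rule on random batches; return per-worker counts and total sizes."""
    rng = random.Random(seed)
    counts = [0] * num_workers
    totals = [0.0] * num_workers
    for _ in range(num_batches):
        batch_id = rng.getrandbits(64)   # assumed: uniformly random 64-bit ID
        size = rng.paretovariate(1.5)    # assumed: heavy-tailed batch sizes
        # Every worker evaluates the rule independently; no messages are exchanged.
        owners = [x for x in range(num_workers)
                  if is_mine(batch_id, x, num_workers)]
        if len(owners) != 1:
            raise RuntimeError("rule must select exactly one worker")
        counts[owners[0]] += 1
        totals[owners[0]] += size
    return counts, totals


if __name__ == "__main__":
    counts, totals = simulate()
    print("batches per worker:", counts)
    print("total size per worker:", [round(t, 1) for t in totals])
```

Running it with different seeds should show the per-worker batch counts staying close to each other while the per-worker total sizes drift apart whenever one worker draws a few very large batches, which is the imbalance described above.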
How can I change the algorithm so that each worker still assigns batches to itself, but takes batch size into account, so that on average each worker independently ends up with the same total amount of work (summed over its batches)?