How to keep multiple processes from reading the same documents

My scenario is that I have a collection of many documents that need to be processed, one document at a time. Processing a single document takes a relatively long time, and processing the entire collection takes many hours. Therefore, I will have several simultaneous "workers" processing the same collection. Each worker needs to:

(A) get the next unprocessed document,

(B) process it,

(C) mark the document as processed and continue.

How can I ensure that simultaneous processes do not read the same documents? I don't know what the key values will be, so I can't say something like "process_A should start at 1 and process_B should start at a million". I would also like to be able to add as many processes as is manageable, so it is impractical to have one go forward and another go backward.

I am asking about MongoDB because that is what I use. I assume the same question could be asked about an SQL database.

I ask anyone who wants to help not to focus on changing the scenario, which for external reasons is a given.

Thanks.

1 answer

. , _id . , , , .

, Mongo . , " ". , , _id , 1 , .
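One common way to implement the claim / process / mark loop from the question is to let the database hand out documents atomically, so that no two workers can receive the same one. The sketch below is only an illustration under assumptions, not a reconstruction of the answer above: it assumes each document may carry an extra `status` field, and the names `status`, `worker`, `claimed_at`, `claim_next` and `run_worker` are hypothetical. It relies on `find_one_and_update`, which pymongo collections provide and which finds and modifies a single document in one atomic step.

```python
import datetime


def claim_next(collection, worker_id):
    """Atomically claim one unprocessed document; return None if none remain.

    `collection` is expected to behave like a pymongo Collection.
    The field names used here are assumptions made for this sketch.
    """
    return collection.find_one_and_update(
        # Only documents that nobody has claimed yet (no status field).
        {"status": {"$exists": False}},
        # Mark the document as taken in the same atomic operation.
        {"$set": {
            "status": "in_progress",
            "worker": worker_id,
            "claimed_at": datetime.datetime.now(datetime.timezone.utc),
        }},
        # Take documents in _id order; no knowledge of key values is needed.
        sort=[("_id", 1)],
    )


def run_worker(collection, worker_id, process):
    """Steps (A), (B), (C): claim, process, mark as done, repeat."""
    handled = 0
    while True:
        doc = claim_next(collection, worker_id)   # (A)
        if doc is None:
            break                                 # nothing left to claim
        process(doc)                              # (B)
        collection.update_one(                    # (C)
            {"_id": doc["_id"]},
            {"$set": {"status": "done"}},
        )
        handled += 1
    return handled
```

With pymongo this might be started as `run_worker(MongoClient()["mydb"]["docs"], "worker-1", my_process)`, where the database, collection and function names are placeholders. Because the claim is a single atomic operation, more workers can be added without partitioning the key space. A worker that crashes mid-document leaves that document at `in_progress`; the `claimed_at` timestamp is stored so that stale claims could be detected and released, which this sketch does not do.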


Source: https://habr.com/ru/post/1617534/

