MongoDb Streaming Out Entered data in real time (or in real time)

I have several MongoDB collections that take multiple JSON documents from various streaming sources. In other words, there are a number of processes that constantly insert data into the MongoDB collection of collections.

I need a way to transfer data from MongoDB to applications located downstream. So I want the system to look like this:

App Stream1 --> App Stream2 --> MONGODB ---> Aggregated Stream App Stream3 --> 

Or that:

 App Stream1 --> ---> MongoD Stream1 App Stream2 --> MONGODB ---> MongoD Stream2 App Stream3 --> ---> MongoD Stream3 

The question is, how can I transfer data from Mongo without requiring constant polling / querying the database?

The obvious answer to the question is β€œwhy don’t you change these application threading streams for sending messages to a queue, such as Rabbit, Zero or ActiveMQ, which then sends them to your Mongo Streaming and Mongo streams right away”:

  MONGODB /|\ | App Stream1 --> | ---> MongoD Stream1 App Stream2 --> SomeMQqueue ---> MongoD Stream2 App Stream3 --> ---> MongoD Stream3 

In an ideal world, yes, that would be nice, but we need Mongo to ensure that messages are saved first, to avoid duplication, and to ensure that all identifiers are generated, etc. Mongo should sit in the middle like a persistence layer.

So, how can I transfer messages from the Mongo collection (without using GridFS, etc.) to these streaming applications. The main school of thought was simply to interrogate new documents, and each collected document updated it by adding another field to the JSON documents stored in the database, just like the process flag in the SQL table in which the processed timestamp is stored. That is, every 1st survey for documents where == null .... add is processed is processed = now () .... update the document.

Is there a tidier / more computationally efficient method?

FYI is all Java processes.

Hurrah!

+6
source share
1 answer

If you are writing a limited collection (or collections), you can use tailablecursor to enter new data into the stream or to the message queue from where it can be derived. However, this will not work for a collection without restrictions.

+3
source

Source: https://habr.com/ru/post/895719/


All Articles