Paginated chronological queries across multiple collections in MongoDB?

Is there an efficient way to run a range-based query across multiple collections, sorted by an indexed timestamp? I basically need to pull the latest 30 documents across 3 collections. The obvious way is to query each collection for its latest 30 documents, then merge and filter the results. However, this is somewhat inefficient.

Even if I selected only the timestamp field in those queries and then ran a second batch of queries for the 30 documents actually needed, I'm not sure that is the best approach. Either way it means 90 documents (whole or one field each) per page request.

Essentially, a client can subscribe to articles, and each category of article differs by 0 to 2 fields. I picked 3 because that is the average number of article categories users have subscribed to so far in beta. Because of the possible field differences, I didn't think it would be very consistent to put articles of different types in one collection.

+4
3 answers

MongoDB query operations work on only one collection at a time, so you need to structure your schema with collections that match your query needs.

Option A: get IDs from a supporting collection, load the full documents, sort in memory

You need to either maintain a supporting collection that combines the IDs, main collection names, and timestamps from the three collections, and query it to get your 30 ID/collection pairs. Then load the corresponding full documents with 3 additional queries (one per main collection). Remember that these will not come back in the correct combined order, so you need to sort that page of results in memory before returning it to your client.

{ _id: ObjectId, updated: Date, type: String } 

This approach lets MongoDB do the pagination for you.
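
As a rough illustration of Option A, here is a minimal Python sketch in which a plain list stands in for the supporting collection and dicts stand in for the three main collections. The names `load_page`, `index_entries`, and `collections` are hypothetical. Against a real database, step 1 would be a single sorted, limited query on the supporting collection and step 3 would be one `$in` query per main collection.

```python
from collections import defaultdict

def load_page(index_entries, collections, page=0, page_size=30):
    """Return one page of full documents, newest first.

    index_entries: entries shaped like {_id, updated, type}
                   (stand-in for the supporting collection).
    collections:   maps a type name to {_id: full_document}
                   (stand-in for the main collections).
    """
    # Step 1: paginate over the supporting collection only.
    ordered = sorted(index_entries,
                     key=lambda e: (e["updated"], e["_id"]),
                     reverse=True)
    entries = ordered[page * page_size:(page + 1) * page_size]

    # Step 2: group IDs by type so each main collection is hit once.
    ids_by_type = defaultdict(list)
    for e in entries:
        ids_by_type[e["type"]].append(e["_id"])

    # Step 3: fetch the full documents, one batch per main collection.
    docs = {}
    for type_name, ids in ids_by_type.items():
        for _id in ids:
            docs[(type_name, _id)] = collections[type_name][_id]

    # Step 4: restore the combined order, which the per-collection
    # fetches do not preserve.
    return [docs[(e["type"], e["_id"])] for e in entries]
```

The price of this layout is that every insert or update in a main collection must also be written to the supporting collection.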

Option B: 3 Requests, Union, Sort, Limit

Or, as you said, load 30 documents from each collection, build the union in memory, sort it, drop the extra 60, and return the combined result. This avoids the extra collection and the cost of keeping it in sync.
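
The in-memory step of Option B can be sketched in Python as follows. This assumes each per-collection result is already sorted newest first (for example, by a descending sort on the timestamp with a limit of 30) and that the timestamp field is called `updated`, as in the schema above. The function name `merge_latest` is hypothetical.

```python
import heapq
from itertools import islice

def merge_latest(result_sets, limit=30):
    """Merge per-collection results and keep the `limit` newest documents.

    Each element of result_sets must already be sorted newest first;
    heapq.merge relies on that and does not re-sort its inputs.
    """
    merged = heapq.merge(*result_sets,
                         key=lambda d: d["updated"],
                         reverse=True)
    # islice stops after `limit` documents, so the rest are never touched.
    return list(islice(merged, limit))
```

Because the inputs are pre-sorted, the merge is lazy and only compares the heads of the three lists, rather than sorting all 90 documents from scratch.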

So I think your current approach (Option B, as I call it) is the lesser of two not-so-good options.

+1

If your query is really about getting the latest articles based on a selection of categories, I suggest you:

A) Store all documents in one collection so that a single query can return the combined, paged result. Otherwise, unless the date ranges are very consistent across collections, you will need to keep track of date ranges and counts so that you can sensibly fetch a set of documents that can be merged: the 30 from one collection may all be older than everything from another. You can add an index on timestamp and category, and then limit the results.

B) Load everything aggressively so you rarely have to merge

+1

You can use the same idea that I explained in the post named below. Although that post is about MongoDB text search, the idea applies to any type of query.

MongoDB index optimization when using text search in the aggregation framework

The idea is to query each of your collections sorted by date and ID, then merge the sorted results to produce the first page. Subsequent pages are retrieved using the date and ID of the last document on the previous page.
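
A minimal Python sketch of this kind of paging, with lists of dicts standing in for the collections. The field names `updated` and `_id` and the functions `sort_key`, `keyset_filter`, and `next_page` are assumptions for illustration, and `_id` values are assumed to be unique across collections. `keyset_filter` shows the shape of a MongoDB filter for "strictly older than the last document seen"; the in-memory code applies the same rule.

```python
import heapq
from itertools import islice

def sort_key(doc):
    # Date first, then _id as a tie-breaker so the ordering is total.
    return (doc["updated"], doc["_id"])

def keyset_filter(last_key):
    """MongoDB-style filter matching documents older than last_key."""
    if last_key is None:
        return {}  # first page: no lower bound needed
    updated, _id = last_key
    return {"$or": [
        {"updated": {"$lt": updated}},
        {"updated": updated, "_id": {"$lt": _id}},
    ]}

def next_page(collections, last_key=None, page_size=30):
    """Return (page, new_last_key); pass new_last_key to get the next page."""
    batches = []
    for docs in collections:
        # Stand-in for: find(keyset_filter(last_key)), sorted descending
        # on (updated, _id), limited to page_size.
        older = [d for d in docs
                 if last_key is None or sort_key(d) < last_key]
        older.sort(key=sort_key, reverse=True)
        batches.append(older[:page_size])
    merged = heapq.merge(*batches, key=sort_key, reverse=True)
    page = list(islice(merged, page_size))
    new_last = sort_key(page[-1]) if page else last_key
    return page, new_last
```

Unlike skip-based paging, each page costs the same regardless of how deep the client has scrolled, because every query starts from the last key rather than from the top.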

0

Source: https://habr.com/ru/post/1499241/

