How to store an ordered set of documents in MongoDB without using a capped collection

What's a good way to store a set of documents in MongoDB where order is important? I need to easily insert documents at an arbitrary position and possibly reorder them later.

I could assign an increasing number to each element and sort by it, or I could sort by _id, but I don't know how I would then insert a document between two others. Say I want to insert something between an element with a sequence of 5 and an element with a sequence of 6?

My first guess was to increment the sequence of all the following elements, to make room for the new element, using a query like db.items.update({"sequence":{$gte:6}}, {$inc:{"sequence":1}}). My limited understanding of database administration tells me that such a query would be slow and generally a bad idea, but I'm happy to be corrected.
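To make the shifting idea concrete, here is a minimal in-memory sketch in plain JavaScript. The `items` array only stands in for the collection, and `insertByShifting` is a hypothetical helper, not a MongoDB API. Note that in the mongo shell, `update` modifies only the first matching document unless `{multi: true}` is passed, so the real query would need that option.

```javascript
// In-memory sketch of "shift everything after the gap, then insert".
// `items` stands in for the documents of a collection; no database is involved.
function insertByShifting(items, newDoc, sequence) {
  // Same effect as {$inc: {sequence: 1}} on every doc with sequence >= `sequence`.
  for (const item of items) {
    if (item.sequence >= sequence) {
      item.sequence += 1;
    }
  }
  // The freed slot is now used by the new document.
  items.push(Object.assign({}, newDoc, { sequence: sequence }));
  // Return the documents in their logical order.
  return items.slice().sort((a, b) => a.sequence - b.sequence);
}

const shifted = insertByShifting(
  [
    { name: "a", sequence: 5 },
    { name: "b", sequence: 6 },
    { name: "c", sequence: 7 },
  ],
  { name: "new" },
  6
);
console.log(shifted.map((d) => d.name + ":" + d.sequence).join(" "));
```

Every document after the gap is touched, which is why the cost grows with the number of documents that follow the insertion point.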

I suppose I could set the new element's sequence to 5.5, but I think that would get messy rather quickly. (Again, correct me if I'm wrong.)
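A small sketch of the fractional idea, and of why it can get messy: sequence values are double-precision numbers, so the gap between two neighbours can only be halved a limited number of times before no distinct midpoint exists. Both `sequenceBetween` and `countBisections` are hypothetical helper names used only for illustration.

```javascript
// Returns a sequence value strictly between `before` and `after`,
// or null when floating-point precision leaves no room for one.
function sequenceBetween(before, after) {
  const mid = (before + after) / 2;
  if (!(mid > before && mid < after)) {
    return null;
  }
  return mid;
}

// Counts how many times a document can be inserted directly after `lo`
// (each time halving the remaining gap) before the gap is exhausted.
function countBisections(lo, hi) {
  let count = 0;
  let upper = hi;
  for (;;) {
    const mid = sequenceBetween(lo, upper);
    if (mid === null) {
      return count;
    }
    upper = mid;
    count += 1;
  }
}

console.log(sequenceBetween(5, 6));
console.log(countBisections(5, 6));
```

When `sequenceBetween` returns null, the sequences around that spot would have to be renumbered, which brings back the shifting problem, only less often.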

I could use a capped collection, which has a guaranteed order, but then I would run into problems if I needed to grow the collection. (Again, I could be wrong about that.)

I could have every document hold a reference to the next document, but that would require a query for each item in the list. (You would fetch an item, push it onto the results array, and then fetch the next item based on the next field of the current one.) Aside from the obvious performance issues, I also wouldn't be able to pass a sorted mongo cursor to my Handlebars {{#each}} expression and have it update live as the database changes. (I use the Meteor full-stack framework.)
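For what it's worth, the per-item queries could be avoided by fetching all documents in one query and stitching the order together on the client. The sketch below assumes each document stores the `_id` of its successor in a `next` field (null for the last one) and that the `_id` of the first document is known; `orderByNextLinks` is a hypothetical helper. It still does not give you a sorted cursor, so the reactive template problem remains.

```javascript
// Rebuilds the list order from `next` links after a single fetch.
// `docs` stands in for the result of one find() call, in any order.
function orderByNextLinks(docs, headId) {
  const byId = new Map(docs.map((d) => [d._id, d]));
  const ordered = [];
  const seen = new Set(); // guards against accidental cycles in the links
  let current = byId.get(headId);
  while (current !== undefined && !seen.has(current._id)) {
    seen.add(current._id);
    ordered.push(current);
    current = byId.get(current.next);
  }
  return ordered;
}

const linked = orderByNextLinks(
  [
    { _id: "c", next: null },
    { _id: "a", next: "b" },
    { _id: "b", next: "c" },
  ],
  "a"
);
console.log(linked.map((d) => d._id).join(","));
```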

I know that every approach has its pros and cons, and I may just have to use one of the options listed above, but I would like to know whether there is a better way to do this.

3 answers

Based on your requirements, one approach would be to design your schema so that each document can hold more than one document and itself acts as a capped container.

 { "_id":Number, "doc":Array } 

Each document in the collection acts as a capped container, and the documents themselves are stored as an array in its doc field. Being an array, the doc field maintains insertion order. You can limit the number of documents per container to n. The _id of each successive container document is then incremented by n, the number of documents a container can hold.

By doing this, you avoid adding extra fields to the documents, extra indexes, and unnecessary sorts.
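As a rough illustration of this numbering, the container and array offset for a given overall position can be computed directly. The sketch assumes 0-based positions and that every earlier container is completely full; `locate` is a hypothetical helper name.

```javascript
// Maps a 0-based overall position to the container _id and the offset
// inside that container's doc array, for containers of size `capacity`.
// Only valid while all containers before the target one are full.
function locate(position, capacity) {
  const offset = position % capacity;
  return { containerId: position - offset, offset: offset };
}

const spot = locate(7, 5);
console.log(spot.containerId, spot.offset);
```

With a capacity of 5, position 7 lands in the container with _id 5, at index 2 of its doc array.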

Insert the very first record

i.e., when the collection is empty.

    var record = {"name" : "first"};
    db.col.insert({"_id" : 0, "doc" : [record]});

Insert subsequent entries

  • Determine the _id of the last container document and the number of documents it holds.
  • If it holds fewer than n documents, update that container document with the new document; otherwise, create a new container document.

Let's say that each container document can contain a maximum of 5 documents, and we want to insert a new document.

    var record = {"name" : "newlyAdded"};

    // Using aggregation, get the _id of the last container and the
    // number of records it currently holds. The $sort stage makes
    // $last refer to the container with the highest _id.
    db.col.aggregate([
        { $sort : { "_id" : 1 } },
        { $group : { "_id" : null,
                     "max" : { $max : "$_id" },
                     "lastDocSize" : { $last : "$doc" } } },
        { $project : { "currentMaxId" : "$max",
                       "capSize" : { $size : "$lastDocSize" },
                       "_id" : 0 } }
    ]).forEach(function(check) {
        // Once obtained, either update the last container or create
        // a new container and insert the document into it.
        if (check.capSize < 5) {
            print("updating");
            db.col.update({ "_id" : check.currentMaxId },
                          { $push : { "doc" : record } });
        } else {
            print("inserting");
            db.col.insert({ "_id" : check.currentMaxId + 5,
                            "doc" : [ record ] });
        }
    });

Note that aggregation runs on the server side and is very efficient. Also note that in versions before 2.6, aggregate returns a single document rather than a cursor, so there you would need to modify the code above to read from that document instead of iterating over a cursor.

Insert a new document between documents

Now, if you want to insert a new document between documents 1 and 2, we know that it belongs in the container with _id=0 and should be placed in the second position of that container's doc array.

So we use the $each and $position operators to insert at a specific position.

    var record = {"name" : "insertInMiddle"};
    db.col.update(
        { "_id" : 0 },
        { $push : { "doc" : { $each : [record], $position : 1 } } }
    );

Handling overflow

Now we need to take care of documents overflowing a container. Say we insert a new document in between, into the container with _id=0. If that container already holds 5 documents, we need to move its last document to the next container, and repeat this until every container is within capacity; if necessary, we finally create a new container to hold the overflowing document.

This complex operation should be performed on the server side. To handle it, we can create a script like the one below and register it with MongoDB.

    db.system.js.save({
        "_id" : "handleOverFlow",
        "value" : function handleOverFlow(id) {
            var currDocArr = db.col.find({ "_id" : id })[0].doc;
            var count = currDocArr.length;
            var nextColId = id + 5;
            // Nothing to do if the container is within capacity.
            if (count <= 5) return;
            // Take the last doc and push it to the front of the
            // next container's array.
            print("updating container: " + id);
            var record = currDocArr.splice(currDocArr.length - 1, 1);
            // Update the next container, creating it if it does not exist.
            db.col.update({ "_id" : nextColId },
                          { $push : { "doc" : { $each : record, $position : 0 } } },
                          { upsert : true });
            // Remove the moved doc from the original container.
            db.col.update({ "_id" : id }, { $set : { "doc" : currDocArr } });
            // Check overflow for the subsequent containers, recursively.
            handleOverFlow(nextColId);
        }
    });

So, after every in-between insertion, we can call this function with the container's _id: handleOverFlow(containerId).
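To see how the overflow cascades, here is an in-memory trace in plain JavaScript. A `Map` from container _id to its doc array stands in for the collection, the capacity of 5 matches the example above, and `insertWithOverflow` is a hypothetical helper written iteratively rather than recursively.

```javascript
const CONTAINER_CAPACITY = 5;

// Inserts `record` at `offset` in the given container, then pushes the
// last element of every over-full container to the front of the next one.
function insertWithOverflow(containers, containerId, offset, record) {
  containers.get(containerId).splice(offset, 0, record);
  let currentId = containerId;
  while (containers.get(currentId).length > CONTAINER_CAPACITY) {
    const moved = containers.get(currentId).pop();
    const nextId = currentId + CONTAINER_CAPACITY;
    if (!containers.has(nextId)) {
      // No container left to absorb the overflow, so create one.
      containers.set(nextId, []);
    }
    containers.get(nextId).unshift(moved);
    currentId = nextId;
  }
}

const containers = new Map([
  [0, ["a", "b", "c", "d", "e"]],
  [5, ["f", "g", "h", "i", "j"]],
]);
insertWithOverflow(containers, 0, 1, "X");
console.log(JSON.stringify(Array.from(containers)));
```

Starting from two full containers, inserting "X" at index 1 of container 0 moves "e" into container 5 and "j" into a newly created container 10, so one in-between insert can touch every container that follows.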

Retrieving all records in order

Just use the $unwind operator in the aggregation pipeline, after sorting the containers by _id so that their order is guaranteed.

 db.col.aggregate([{$sort:{"_id":1}},{$unwind:"$doc"},{$project:{"_id":0,"doc":1}}]); 
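For intuition, the same flattening can be sketched on plain arrays: order the containers by _id, then concatenate their doc arrays. Here `flattenContainers` is a hypothetical helper, and unlike the pipeline it returns the bare elements rather than documents with a doc field.

```javascript
// Returns all stored records in overall order, given container documents
// of the form { _id: Number, doc: Array } in any order.
function flattenContainers(containerDocs) {
  return containerDocs
    .slice()
    .sort((a, b) => a._id - b._id)
    .reduce((all, container) => all.concat(container.doc), []);
}

const flat = flattenContainers([
  { _id: 5, doc: ["c", "d"] },
  { _id: 0, doc: ["a", "b"] },
]);
console.log(flat.join(","));
```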

Reordering documents

You can give each document stored in a capped container its own "_id" field:

 .."doc":[{"_id":0,"name":"xyz",...}..].. 

Get the "doc" array of the capped container whose elements you want to reorder.

 var docArray = db.col.find({"_id":0})[0].doc; 

Update the _id values of the elements to reflect the new order.

Then sort the array by those _id values.

 docArray.sort( function(a, b) { return a._id - b._id; }); 

Finally, write the new doc array back to the capped container.
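A variation on these steps, as a sketch on a plain array: move the element with splice and then renumber the per-element _id values to match the new positions, instead of editing the ids first and sorting afterwards. `moveElement` is a hypothetical helper; in the mongo shell the result could then be written back with something like db.col.update({"_id": 0}, {$set: {"doc": reordered}}).

```javascript
// Moves the element at index `from` to index `to` and renumbers the
// per-element _id so that it mirrors the new position. Returns a new array.
function moveElement(docArray, from, to) {
  const copy = docArray.slice();
  const moved = copy.splice(from, 1)[0];
  copy.splice(to, 0, moved);
  return copy.map((d, i) => Object.assign({}, d, { _id: i }));
}

const reordered = moveElement(
  [
    { _id: 0, name: "a" },
    { _id: 1, name: "b" },
    { _id: 2, name: "c" },
    { _id: 3, name: "d" },
  ],
  3,
  1
);
console.log(reordered.map((d) => d._id + ":" + d.name).join(" "));
```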

But then again, it all comes down to which approach is feasible and best suits your requirements.

Getting to your questions:

What's a good way to store a set of documents in MongoDB where order is important? I need to easily insert documents at an arbitrary position and possibly reorder them later.

Store the documents as arrays.

Say I want to insert something between an element with a sequence of 5 and an element with a sequence of 6?

Use the $each and $position operators in the db.collection.update() function, as shown in my answer.

My limited understanding of database administration tells me that such a query would be slow and generally a bad idea, but I'm happy to be corrected.

Yes. It would hurt performance unless the collection holds very little data.

I could use a capped collection, which has a guaranteed order, but then I would run into problems if I needed to grow the collection. (Again, I could be wrong about that.)

Yes. With capped collections, you can lose data.


For arbitrary sorting of any collection, you will need a field to sort by. I call mine "sequence".

    // schema: { _id: ObjectID, sequence: Number, ... }
    db.items.createIndex({sequence: 1});
    db.items.find().sort({sequence: 1});

The _id field in MongoDB is a unique, indexed key, similar to a primary key in relational databases. If your documents have an inherent order, ideally you should be able to give each document a unique key whose value reflects that order. So, when preparing a document for insertion, explicitly set the _id field to this key (if you do not, mongo will generate one automatically as a BSON ObjectId).

As for retrieving results, MongoDB does not guarantee the order in which documents are returned unless you explicitly use .sort(). Without .sort(), results are usually returned in natural order (often the insertion order), but there is no guarantee of this behavior.

I would advise setting _id to your ordering key on insert and sorting by it on retrieval. Because _id is mandatory and automatically indexed, you won't waste any space defining a separate sort key and maintaining an index for it.


Source: https://habr.com/ru/post/1204121/

