Given your requirements, one approach is to design your schema so that each document can store more than one document and itself acts as a capped container.
{ "_id":Number, "doc":Array }
Each document in the collection acts as a capped container, with the documents themselves stored as an array in the doc field. Since the doc field is an array, it maintains insertion order. You can limit the number of documents per container to n. The _id field of each container document is then incremented by n, the number of documents a container can hold.
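To make the addressing concrete, here is a small plain-JavaScript sketch (the function name `locate` is illustrative, not part of any driver API) of how a zero-based global sequence number maps to a container _id and a position inside that container's doc array, assuming a capacity of n documents per container:

```javascript
// Capacity of each container document (n).
var CAPACITY = 5;

// Given a zero-based global sequence number, compute which container
// holds it and at which index inside that container's "doc" array.
// Container _ids advance in steps of CAPACITY: 0, 5, 10, ...
function locate(seq, capacity) {
  var containerId = Math.floor(seq / capacity) * capacity;
  var position = seq % capacity;
  return { containerId: containerId, position: position };
}

console.log(locate(0, CAPACITY));   // { containerId: 0, position: 0 }
console.log(locate(7, CAPACITY));   // { containerId: 5, position: 2 }
console.log(locate(12, CAPACITY));  // { containerId: 10, position: 2 }
```

This arithmetic is why incrementing _id by n is convenient: the container holding any given position can be computed directly instead of scanned for.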
By doing this, you avoid adding extra fields to the documents, extra indexes, and unnecessary sorts.
Insert the very first record
i.e., when the collection is empty.
var record = {"name" : "first"};
db.col.insert({"_id" : 0, "doc" : [record]});
Insert subsequent entries
- Determine the _id of the last container document and the number of documents it holds.
- If it holds fewer than n documents, update it with the new document; otherwise, create a new container document.
Let's say that each container document can contain a maximum of 5 documents, and we want to insert a new document.
var record = {"name" : "newlyAdded"};

// Using aggregation, get the _id of the last inserted container and the
// number of records it currently holds.
db.col.aggregate([
    { $group : {
        "_id" : null,
        "max" : { $max : "$_id" },
        "lastDocSize" : { $last : "$doc" }
    } },
    { $project : {
        "currentMaxId" : "$max",
        "capSize" : { $size : "$lastDocSize" },
        "_id" : 0
    } }
// Once obtained, check whether to update the last container or to create
// a new container and insert the document into it.
]).forEach(function(check) {
    if (check.capSize < 5) {
        print("updating");
        // UPDATE
        db.col.update({ "_id" : check.currentMaxId },
                      { $push : { "doc" : record } });
    } else {
        print("inserting");
        // INSERT
        db.col.insert({ "_id" : check.currentMaxId + 5,
                        "doc" : [ record ] });
    }
});
Please note that aggregation runs on the server side and is very efficient. Also note that in versions prior to 2.6, aggregate() returns a single document rather than a cursor, so you would need to modify the code above to read from that single result document instead of iterating over a cursor.
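In those older shells the response is a document of the form { result : [ ... ], ok : 1 }. The plain-JavaScript snippet below only illustrates that shape (the response object is hard-coded here, not fetched from a server):

```javascript
// Hard-coded stand-in for what a pre-2.6 shell returns from aggregate().
var response = {
  result: [ { currentMaxId: 10, capSize: 3 } ],
  ok: 1
};

// Instead of forEach over a cursor, read the single result document directly.
var check = response.result[0];
if (check.capSize < 5) {
  console.log("updating container " + check.currentMaxId);
} else {
  console.log("inserting new container " + (check.currentMaxId + 5));
}
// prints "updating container 10"
```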
Insert a new document between documents
Now, if you want to insert a new document between documents 1 and 2, we know that it must go into the container with _id=0, at the second position of that container's doc array.
Therefore, we use the $each and $position operators to insert at a specific position.
var record = {"name" : "insertInMiddle"};

db.col.update(
    { "_id" : 0 },
    { $push : {
        "doc" : {
            $each : [record],
            $position : 1
        }
    } }
);
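The effect of $push with $each and $position on the stored array is equivalent to a splice in plain JavaScript, which you can verify without a server:

```javascript
// Effect of { $push: { doc: { $each: [record], $position: 1 } } },
// modeled with Array.prototype.splice (illustration only).
var doc = [ { name: "first" }, { name: "second" } ];
var record = { name: "insertInMiddle" };

doc.splice(1, 0, record);  // insert at index 1, delete nothing

console.log(doc.map(function(d) { return d.name; }));
// [ "first", "insertInMiddle", "second" ]
```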
Stream processing
Now we need to handle overflowing containers. Say we insert a new document into the container with _id=0, which already holds 5 documents. We then need to move its last document to the next container, and repeat this until every container is back within its capacity; if necessary, a new container is finally created to hold the overflow.
This complex operation should be performed on the server side. To handle it, we can create a script such as the one below and register it with MongoDB.
db.system.js.save({
    "_id" : "handleOverFlow",
    "value" : function handleOverFlow(id) {
        var currDocArr = db.col.find({ "_id" : id })[0].doc;
        print(currDocArr);
        var count = currDocArr.length;
        var nextColId = id + 5;

        // Check whether the container capacity has been exceeded.
        if (count <= 5)
            return;
        else {
            // Take the last doc and push it to the front of the next
            // capped container's array.
            print("updating collection: " + id);
            var record = currDocArr.splice(currDocArr.length - 1, 1);

            // Update the next container.
            db.col.update({ "_id" : nextColId },
                          { $push : { "doc" : { $each : record,
                                                $position : 0 } } });

            // Remove the moved doc from the original container.
            db.col.update({ "_id" : id }, { "doc" : currDocArr });

            // Check overflow for the subsequent containers, recursively.
            handleOverFlow(nextColId);
        }
    }
});
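The cascading behaviour of this function can be sanity-checked without a server. The sketch below models the collection as a plain array of { _id, doc } objects (the capacity of 5 mirrors the script above; this is a simulation of the logic, not the stored procedure itself):

```javascript
var CAPACITY = 5;

// In-memory stand-in for the collection: ordered container documents.
function handleOverflowSim(containers, id) {
  var idx = containers.findIndex(function(c) { return c._id === id; });
  if (idx === -1 || containers[idx].doc.length <= CAPACITY) return;

  // Pop the last document of the overflowing container...
  var moved = containers[idx].doc.pop();

  // ...and push it to the front of the next container, creating it if needed.
  var nextId = id + CAPACITY;
  var next = containers.find(function(c) { return c._id === nextId; });
  if (!next) {
    next = { _id: nextId, doc: [] };
    containers.push(next);
  }
  next.doc.unshift(moved);

  // Recurse in case the next container now overflows too.
  handleOverflowSim(containers, nextId);
}

var containers = [
  { _id: 0, doc: [1, 2, 3, 4, 5, 6] },  // one over capacity
  { _id: 5, doc: [7, 8, 9, 10, 11] }    // already full
];
handleOverflowSim(containers, 0);
console.log(containers.map(function(c) { return c.doc; }));
// [ [1,2,3,4,5], [6,7,8,9,10], [11] ]
```

Note how the overflow ripples through the full container with _id=5 and ends with a new container, _id=10, being created for the final overflowing document.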
So, after every in-between insertion, we can call this function with the container's identifier: handleOverFlow(containerId).
Retrieving all records in order
Just use the $unwind operator in the aggregation pipeline.
db.col.aggregate([
    { $unwind : "$doc" },
    { $project : { "_id" : 0, "doc" : 1 } }
]);
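What $unwind does here is flatten the containers back into one ordered stream; in plain JavaScript, the equivalent operation on the stored data would be:

```javascript
// Containers as stored (ascending _id), each holding ordered documents.
var containers = [
  { _id: 0, doc: [ { name: "a" }, { name: "b" } ] },
  { _id: 5, doc: [ { name: "c" } ] }
];

// Equivalent of { $unwind: "$doc" } followed by projecting only "doc":
var ordered = containers.reduce(function(acc, c) {
  return acc.concat(c.doc);
}, []);

console.log(ordered.map(function(d) { return d.name; }));  // [ "a", "b", "c" ]
```

Because the containers come back in ascending _id order and each doc array preserves insertion order, the flattened stream is the full collection in order.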
Reordering documents
You can store each document inside its capped container with its own "_id" field:
.."doc" : [ { "_id" : 0, "name" : "xyz", ... }, .. ]..
Get the "doc" array of the capped container whose elements you want to reorder.
var docArray = db.col.find({"_id" : 0})[0].doc;
Update the _ids so that, after sorting, the elements are in the order you want. Then sort the array by _id:
docArray.sort( function(a, b) { return a._id - b._id; });
Update the capped container with the new doc array.
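Putting the reorder steps together in plain JavaScript (the final db.col.update call is shown as a comment since it needs a live shell; using $set to replace the field is the standard approach):

```javascript
// Documents with _ids rewritten to reflect the desired new order.
var docArray = [
  { _id: 2, name: "c" },
  { _id: 0, name: "a" },
  { _id: 1, name: "b" }
];

// Sort by _id so the array matches the intended order.
docArray.sort(function(a, b) { return a._id - b._id; });

console.log(docArray.map(function(d) { return d.name; }));  // [ "a", "b", "c" ]

// In the shell, write the reordered array back to the container:
// db.col.update({ "_id" : 0 }, { $set : { "doc" : docArray } });
```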
But then again, it all comes down to which approach is feasible and best suited to your requirements.
Getting to your questions:
What is a good way to store a collection of documents in MongoDB where order is important? I need to be able to easily insert documents at an arbitrary position and possibly reorder them later.
Store the documents as arrays inside container documents, as described above.
Say I want to insert something between an element with a sequence of 5 and an element with a sequence of 6?
Use the $each and $position operators with db.collection.update(), as shown in my answer.
My limited understanding of database administration tells me that such a query would be slow and generally a bad idea, but I'm happy to be corrected.
Yes. It would affect performance, unless the collection holds very little data.
I could use a capped collection, which has a guaranteed order, but then I would run into problems if I needed to grow the collection. (Then again, I could be wrong about that.)
Yes. With capped collections, you may lose data.