MongoDB: Avoiding Write Lock Blocking During Large Collection Updates

I have an events collection with 2,502,011 elements and I want to update every one of them. Unfortunately, I keep running into MongoDB errors caused by the write lock.

Question: How can I avoid these errors to ensure that all my events have been updated correctly?

Here is information about my event collection:

 > db.events.stats()
 {
     "count" : 2502011,
     "size" : 2097762368,
     "avgObjSize" : 838.4305136947839,
     "storageSize" : 3219062784,
     "numExtents" : 21,
     "nindexes" : 6,
     "lastExtentSize" : 840650752,
     "paddingFactor" : 1.0000000000874294,
     "systemFlags" : 0,
     "userFlags" : 0,
     "totalIndexSize" : 1265898256,
     "indexSizes" : {
         "_id_" : 120350720,
         "destructured_created_at_1" : 387804032,
         "destructured_updated_at_1" : 419657728,
         "data.assigned_author_id_1" : 76053152,
         "emiting_class_1_data.assigned_author_id_1_data.user_id_1_data.id_1_event_type_1" : 185071936,
         "created_at_1" : 76960688
     }
 }

Here's what an event looks like:

 > db.events.findOne()
 {
     "_id" : ObjectId("4fd5d4586107d93b47000065"),
     "created_at" : ISODate("2012-06-11T11:19:52Z"),
     "data" : {
         "project_id" : ObjectId("4fc3d2abc7cd1e0003000061"),
         "document_ids" : [
             "4fc3d2b45903ef000300007d",
             "4fc3d2b45903ef000300007e"
         ],
         "file_type" : "excel",
         "id" : ObjectId("4fd5d4586107d93b47000064")
     },
     "emiting_class" : "DocumentExport",
     "event_type" : "created",
     "updated_at" : ISODate("2013-07-31T08:52:48Z")
 }

I would like to update each event to add two new fields based on the existing created_at and updated_at. Please correct me if I am wrong, but it seems that you cannot use the mongo update command when you need to access the current document's data along the way.

This is my update loop:

 db.events.find().forEach(
     function (e) {
         var created_at = new Date(e.created_at);
         var updated_at = new Date(e.updated_at);
         e.destructured_created_at = [e.created_at]; // omitted the actual values
         e.destructured_updated_at = [e.updated_at]; // omitted the actual values
         db.events.save(e);
     }
 )

When executing the above command, I get a huge number of page faults due to the write lock on the database, which shows up in mongostat.

1 answer

I think you are confused here: it is not the write lock doing this, it is MongoDB querying for your update documents. The lock does not exist during a page fault (in fact, it only exists when actually updating, or rather saving, the document to disk); it yields to other operations.

The lock is more like a mutex in MongoDB.

Page faults on this size of data are perfectly normal, since you obviously don't query this data often; I'm not sure what you expect to see. I'm definitely not sure what you mean by your question:

Question: How can I avoid these errors to ensure that all my events have been updated correctly?

Well, the problem you are likely to encounter is page thrashing on this machine, which in turn destroys your I/O bandwidth and swamps your working set with data that is not needed. Do you really need to add these fields to ALL documents eagerly, or could they be added on demand by the application when that data is used again (see the sketch below)?
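For illustration only, here is a minimal sketch of that on-demand approach in the mongo shell. The helper name fetchEvent and the way the destructured values are computed are assumptions, since the question omits the actual values:

 // Hypothetical lazy migration: populate the new fields the first time an event is read.
 function fetchEvent(id) {
     var e = db.events.findOne({ _id: id });
     if (e && !e.destructured_created_at) {
         e.destructured_created_at = [e.created_at]; // the real destructured values go here
         e.destructured_updated_at = [e.updated_at];
         db.events.save(e); // legacy shell helper; writes the whole document back
     }
     return e;
 }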

Another option is to do it in batches, for example as sketched below.
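A rough sketch of batching in the mongo shell, under the assumption that walking the collection by _id ranges and pausing between batches is acceptable; the batch size and the sleep interval are arbitrary choices, and the destructured values are still left as placeholders:

 // Process the collection in _id order, a fixed-size batch at a time,
 // pausing between batches so other operations get a chance to run.
 var batchSize = 1000;
 var lastId = ObjectId("000000000000000000000000");

 while (true) {
     var batch = db.events.find({ _id: { $gt: lastId } })
                          .sort({ _id: 1 })
                          .limit(batchSize)
                          .toArray();
     if (batch.length === 0) break;

     batch.forEach(function (e) {
         e.destructured_created_at = [e.created_at]; // replace with the real destructured values
         e.destructured_updated_at = [e.updated_at];
         db.events.save(e);
     });

     lastId = batch[batch.length - 1]._id;
     sleep(100); // mongo shell helper: wait 100 ms before the next batch
 }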

One feature that would help here is a priority queue, which would mark such an update as a background task that should not affect the current operation of your mongod too much. I hear such a feature is being worked on (I can't find the JIRA issue, though :/ ).

Please correct me if I am wrong, but it seems that you cannot use the mongo update command when you need to access the current document's data along the way.

You're right.


Source: https://habr.com/ru/post/1494669/

