MongoDB C# collection.Save vs Insert + Update

From the C# driver documentation:

The Save method is a combination of Insert and Update. If the Id member of the document has a value, then it is assumed to be an existing document and Save calls Update on the document (setting the Upsert flag just in case it actually is a new document).

I create my IDs manually in the base class that all of my domain objects inherit from. So every domain object already has an ID when it is inserted into MongoDB.

Question: should I use collection.Save and keep my interface simple, or does the Save call (with the Upsert flag) carry some overhead, so that I should use collection.Insert and collection.Update instead?

My guess is that Save first calls Update, finds out that my new object does not exist yet, and then calls Insert instead. Am I mistaken? Has anyone tested this?

Note: I insert bulk data with InsertBatch, so large data volumes are not a concern here.

Edit, follow-up

I wrote a small test to find out whether calling Update with the Upsert flag has some overhead, in which case Insert might be better. It turned out that they run at the same speed. See my test code below. MongoDbServer and IMongoDbServer are my own generic interface for isolating the storage layer.

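    // Requires: using System.Diagnostics;
    // MongoDbServer, IMongoDbServer and ProductionArea are my own types.
    IMongoDbServer server = new MongoDbServer();
    Stopwatch sw = new Stopwatch();
    const int runs = 100;         // number of timed repetitions
    const int docsPerRun = 10000; // documents written per repetition
    long d1 = 0;                  // total milliseconds spent in Save
    long d2 = 0;                  // total milliseconds spent in Insert

    for (int w = 0; w < runs; w++)
    {
        // Time Save: every object already has an ID, so this takes the upsert path.
        sw.Restart();
        for (int i = 0; i < docsPerRun; i++)
        {
            ProductionArea area = new ProductionArea();
            server.Save(area);
        }
        sw.Stop();
        d1 += sw.ElapsedMilliseconds;

        // Time plain Insert for the same number of new objects.
        sw.Restart();
        for (int i = 0; i < docsPerRun; i++)
        {
            ProductionArea area = new ProductionArea();
            server.Insert(area);
        }
        sw.Stop();
        d2 += sw.ElapsedMilliseconds;
    }

    long a1 = d1 / runs; // average milliseconds per repetition, Save
    long a2 = d2 / runs; // average milliseconds per repetition, Insert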
1 answer

The Save method is not going to make two trips to the server.

The heuristic is this: if the document being saved has no value for its _id field, a value is generated for it and Insert is called. If the document being saved has a non-empty value for _id, Update is called with the Upsert flag, in which case the server decides whether to insert or update.
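
To make the heuristic concrete, here is a minimal in-memory sketch. It does not use the MongoDB driver: InMemoryCollection, its members, and the RoundTrips counter are hypothetical names invented for illustration, and the dictionary only stands in for the server. The point it tries to show is that either branch of Save results in exactly one call to the "server".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a collection. This is NOT the driver's
// API; it only models the Save heuristic described in the answer above.
public class InMemoryCollection
{
    private readonly Dictionary<Guid, string> _docs = new Dictionary<Guid, string>();

    // Counts simulated calls to the server.
    public int RoundTrips { get; private set; }

    public int Count { get { return _docs.Count; } }

    // Plain insert: fails on a duplicate key, much like a unique _id index would.
    public void Insert(Guid id, string body)
    {
        RoundTrips++;
        _docs.Add(id, body);
    }

    // Update with the upsert behaviour: the "server" decides whether this
    // ends up as an insert or a replacement, in a single call.
    public void Upsert(Guid id, string body)
    {
        RoundTrips++;
        _docs[id] = body;
    }

    // Save: no id value means generate one and insert; otherwise upsert.
    public Guid Save(Guid id, string body)
    {
        if (id == Guid.Empty)
        {
            id = Guid.NewGuid();
            Insert(id, body);
        }
        else
        {
            Upsert(id, body);
        }
        return id;
    }
}
```

In this model, saving an object whose ID was assigned in a base class always takes the Upsert branch, which matches the situation in the question: one call per document, with the insert-or-update decision left to the server.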

I don't know whether an Upsert is more expensive than an Insert. I suspect they cost almost the same, and what really matters is that either way there is only a single round trip.

If you know the document is new, you might as well call Insert. And calling InsertBatch is more efficient than calling many individual Inserts. So definitely prefer InsertBatch over Save.


Source: https://habr.com/ru/post/902056/

