How to remove empty-string fields from a MongoDB collection?

I have a MongoDB collection and I would like to remove the fields whose values are empty strings from its documents.

From this:

 { "_id" : ObjectId("56323d975134a77adac312c5"), "year" : "15", "year_comment" : "" }
 { "_id" : ObjectId("56323d975134a77adac312c5"), "year" : "", "year_comment" : "asd" }

I would like to get this result:

 { "_id" : ObjectId("56323d975134a77adac312c5"), "year" : "15" }
 { "_id" : ObjectId("56323d975134a77adac312c5"), "year_comment" : "asd" }

How can I solve it?

+5
3 answers

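Try the following snippet in the mongo shell. It strips fields whose value is null or an empty string from each document and prints the result; it does not write anything back to the collection.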

 var result = [];
 db.getCollection('test').find({}).forEach(function(data) {
     for (var i in data) {
         // strict check for "", so values such as 0 or false are kept
         if (data[i] == null || data[i] === '') {
             delete data[i];
         }
     }
     result.push(data);
 });
 print(tojson(result));
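
The per-document step of the snippet above can be pulled out into a plain function, which makes it easy to check on sample data before touching a collection. This is a minimal sketch: the name `stripEmptyFields` is hypothetical (it is not part of MongoDB), and it compares with `=== ""` so that values such as `0` or `false` are kept, which a loose `== ''` comparison would drop.

```javascript
// Hypothetical helper: returns a copy of `doc` without fields that are
// null, undefined, or the empty string. The input object is not modified.
function stripEmptyFields(doc) {
  const cleaned = {};
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    if (value === null || value === undefined || value === "") {
      continue; // skip empty fields
    }
    cleaned[key] = value;
  }
  return cleaned;
}

// Sample documents shaped like the ones in the question (values are made up).
const samples = [
  { year: "15", year_comment: "" },
  { year: "", year_comment: "asd" },
  { year: 0, year_comment: null },
];
console.log(samples.map(stripEmptyFields));
```

In the shell, the same function could be applied inside `forEach` in place of the inline loop.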
+2

Start by getting a distinct list of all the keys in the collection, use those keys to build the query, and then do an ordered bulk update with the Bulk API. The update uses the $unset operator to remove the fields.

The distinct list of keys needed to build the query can be obtained with Map-Reduce. The following mapreduce operation writes a separate collection that holds all the keys as its _id values:

 mr = db.runCommand({
     "mapreduce": "my_collection",
     "map": function() {
         for (var key in this) { emit(key, null); }
     },
     "reduce": function(key, stuff) { return null; },
     "out": "my_collection" + "_keys"
 })

To get the list of all the dynamic keys, run distinct on the resulting collection:

 db[mr.result].distinct("_id") // prints ["_id", "year", "year_comment", ...] 
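
To make clear what the map-reduce job computes, here is the same idea on an in-memory array: collect every top-level key that occurs in any document. This is only an illustration in plain JavaScript, with made-up sample documents; `distinctKeys` is a hypothetical name, and the sketch is not a replacement for running the job on the server.

```javascript
// Hypothetical helper: every top-level key seen in any document.
// Keys are returned in the order this function first encounters them.
function distinctKeys(docs) {
  const seen = new Set();
  for (const doc of docs) {
    for (const key of Object.keys(doc)) {
      seen.add(key);
    }
  }
  return Array.from(seen);
}

// Made-up sample data; the second document has an extra field.
const docs = [
  { _id: 1, year: "15", year_comment: "" },
  { _id: 2, year: "", year_comment: "asd", extra: true },
];
console.log(distinctKeys(docs));
```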

Now, given the list above, you can assemble the query by building an object whose properties are set in a loop. Typically, the query will have the following structure:

 var keysList = ["_id", "year", "year_comment"];
 var query = keysList.reduce(function(obj, k) {
     var q = {};
     q[k] = "";
     obj["$or"].push(q);
     return obj;
 }, { "$or": [] });
 printjson(query); // prints {"$or":[{"_id":""},{"year":""},{"year_comment":""}]}

You can then use the Bulk API (available in MongoDB 2.6 and above) to batch the updates for better performance with the query above. Overall, it could look like this:

 // "_id" is left out of the keys because it cannot be removed with $unset
 var bulk = db.collection.initializeOrderedBulkOp(),
     counter = 0,
     query = {"$or":[{"year":""},{"year_comment":""}]},
     keysList = ["year", "year_comment"];

 db.collection.find(query).forEach(function(doc) {
     // use filter to return an array of keys which have empty strings
     var emptyKeys = keysList.filter(function(k) {
             return doc[k] === "";
         }),
         // set the update object
         update = emptyKeys.reduce(function(obj, k) {
             obj[k] = "";
             return obj;
         }, {});

     // use the $unset operator to remove the fields
     bulk.find({ "_id": doc._id }).updateOne({ "$unset": update });

     counter++;
     if (counter % 1000 == 0) {
         // execute per 1000 operations and re-initialize
         bulk.execute();
         bulk = db.collection.initializeOrderedBulkOp();
     }
 });

 // execute the remaining operations that did not fill a batch of 1000
 if (counter % 1000 != 0) { bulk.execute(); }
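
The update object built inside the loop can be tested on its own, without a database. The sketch below assumes a hypothetical helper, `buildUnsetUpdate`, that returns the `$unset` document for one record, or `null` when nothing needs removing. It skips `_id`, since MongoDB does not allow that field to be removed.

```javascript
// Hypothetical helper: build the update for one document.
// Returns null when none of the listed keys holds an empty string.
function buildUnsetUpdate(doc, keysList) {
  const unset = {};
  for (const k of keysList) {
    if (k !== "_id" && doc[k] === "") {
      unset[k] = ""; // $unset ignores the value; only the key matters
    }
  }
  return Object.keys(unset).length > 0 ? { $unset: unset } : null;
}

// Made-up sample records for illustration.
const keysList = ["_id", "year", "year_comment"];
console.log(buildUnsetUpdate({ _id: 1, year: "", year_comment: "asd" }, keysList));
console.log(buildUnsetUpdate({ _id: 2, year: "15", year_comment: "x" }, keysList));
```

Inside the `forEach` callback, the result would be passed to `bulk.find({ "_id": doc._id }).updateOne(...)` only when it is not `null`.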
+3

If you only need to clear a single field, or want to go field by field, you can use the updateMany function:

 db.comments.updateMany({year: ""}, { $unset : { year : 1 }}) 
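
To go field by field over several keys, the filter and update pairs can be generated from a list. This is a minimal sketch, assuming a hypothetical helper named `unsetOpsFor`; in the shell, each pair would then be passed to `updateMany`.

```javascript
// Hypothetical helper: one { filter, update } pair per field name.
function unsetOpsFor(fields) {
  return fields.map(function (field) {
    const filter = {};
    filter[field] = ""; // match documents where the field is ""
    const unset = {};
    unset[field] = 1;   // same form as the single-field call above
    return { filter: filter, update: { $unset: unset } };
  });
}

const ops = unsetOpsFor(["year", "year_comment"]);
console.log(JSON.stringify(ops));

// In the mongo shell (collection name taken from the answer above):
// ops.forEach(function (op) { db.comments.updateMany(op.filter, op.update); });
```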
0

Source: https://habr.com/ru/post/1235005/

