Indicate how many documents the field contains

I have three MongoDB documents:

{ "_id" : ObjectId("571094afc2bcfe430ddd0815"), "name" : "Barry", "surname" : "Allen", "address" : [ { "street" : "Red", "number" : NumberInt(66), "city" : "Central City" }, { "street" : "Yellow", "number" : NumberInt(7), "city" : "Gotham City" } ] } { "_id" : ObjectId("57109504c2bcfe430ddd0816"), "name" : "Oliver", "surname" : "Queen", "address" : { "street" : "Green", "number" : NumberInt(66), "city" : "Star City" } } { "_id" : ObjectId("5710953ac2bcfe430ddd0817"), "name" : "Tudof", "surname" : "Unknown", "address" : "homeless" } 

The address field is an Array objects in the first document, and Object in the second and String in the third. My goal is to find how many documents in my collection contain the address.street field. In this case, the correct score is 1, but with my request I get two:

 db.coll.find({"address.street":{"$exists":1}}).count() 

I also tried map / reduce. It works, but it is slower; therefore, if possible, I would avoid this.

+5
source share
2 answers

The difference here is that the .count() operation is actually “correct” when returning the “document” counter in which the field is located. Therefore, general considerations boil down to the following:

If you just want to exclude documents with an array field

Then the most effective way to exclude those documents where the "street" was the property of the "address" as an "array", then simply use the dot-notation property to search for index 0 so that it does not exist in the output:

 db.coll.find({ "address.street": { "$exists": true }, "address.0": { "$exists": false } }).count() 

As an online-coded statement test in both cases, $exists does the right job and is efficient.

If you planned to count entry fields

If what you are actually asking is a “field count”, where some “documents” contain array entries, where this “field” may be present several times.

To do this, you need an aggregation structure or mapReduce, as you mentioned. MapReduce uses JavaScript-based processing and therefore will be significantly slower than the .count() operation. The aggregation frame should also be computed and will be slower than .count() , but not as much as mapReduce.

In MongoDB 3.2, you get some help here due to the expanded ability of $sum to work with an array of values, as well as being a grouping battery. Another helper here is $isArray , which allows you to use a different processing method via $map when the data is actually an "array":

 db.coll.aggregate([ { "$group": { "_id": null, "count": { "$sum": { "$sum": { "$cond": { "if": { "$isArray": "$address" }, "then": { "$map": { "input": "$address", "as": "el", "in": { "$cond": { "if": { "$ifNull": [ "$$el.street", false ] }, "then": 1, "else": 0 } } } }, "else": { "$cond": { "if": { "$ifNull": [ "$address.street", false ] }, "then": 1, "else": 0 } } } } } } }} ]) 

Earlier versions depend on more conditional processing to process the data of the array and the data without the array in different ways and usually require $unwind to process the array records.

Moving an array through $map using MongoDB 2.6:

 db.coll.aggregate([ { "$project": { "address": { "$cond": { "if": { "$ifNull": [ "$address.0", false ] }, "then": "$address", "else": { "$map": { "input": ["A"], "as": "el", "in": "$address" } } } } }}, { "$unwind": "$address" }, { "$group": { "_id": null, "count": { "$sum": { "$cond": { "if": { "$ifNull": [ "$address.street", false ] }, "then": 1, "else": 0 } } } }} ]) 

Or providing conditional choices using MongoDB 2.2 or 2.4:

 db.coll.aggregate([ { "$group": { "_id": "$_id", "address": { "$first": { "$cond": [ { "$ifNull": [ "$address.0", false ] }, "$address", { "$const": [null] } ] } }, "other": { "$push": { "$cond": [ { "$ifNull": [ "$address.0", false ] }, null, "$address" ] } }, "has": { "$first": { "$cond": [ { "$ifNull": [ "$address.0", false ] }, 1, 0 ] } } }}, { "$unwind": "$address" }, { "$unwind": "$other" }, { "$group": { "_id": null, "count": { "$sum": { "$cond": [ { "$eq": [ "$has", 1 ] }, { "$cond": [ { "$ifNull": [ "$address.street", false ] }, 1, 0 ]}, { "$cond": [ { "$ifNull": [ "$other.street", false ] }, 1, 0 ]} ] } } }} ]) 

Thus, the last form "should" perform a little better than mapReduce, but probably not much.

In all cases, the logic boils down to using $ifNull as the "logical" form of $exists for the aggregation structure. When paired with $cond , a “true” result is obtained when the property is actually viewed, and false returned when it is not. This determines whether 1 or 0 returns to total accumulation through $sum .

Ideally, you have a modern version that can do this in one step $group , but otherwise you need a longer way.

+7
source

Can you try this:

 db.getCollection('collection_name').find({ "address.street":{"$exists":1}, "$where": "Array.isArray(this.address) == false && typeof this.address === 'object'" }); 

In the where clause, we exclude if the address is an array and Include address, if it is a type, is an object.

+3
source

Source: https://habr.com/ru/post/1247207/


All Articles