Mongodb groupby slow even after adding index

Question

I have a simple collection:

{ "_id" : ObjectId("5033cc15f31e20b76ca842c8"), "_class" : "com.pandu.model.alarm.Alarm", "serverName" : "CDCAWR009 Integration Service", "serverAddress" : "cdcawr009.na.convergys.com", "triggered" : ISODate("2012-01-28T05:09:03Z"), "componentName" : "IntegrationService", "summary" : "A device which is configured to be recorded is not being recorded.", "details" : "Extension<153; 40049> on CDCAWR009 is currently not being recorded properly; recording requested for the following reasons: ", "priority" : "Major" }

The collection will have about two million such documents. I am trying to group by server name and get the number of documents for each server name. This sounds easy in terms of RDBMS queries.

The query that I have come up with is:

 db.alarm.group({ key: { serverName: true }, reduce: function(obj, prev) { prev.count++; }, initial: { count: 0 } });

In addition, I added an index on serverName.

 > db.alarm.getIndexes()
 [
   { "v" : 1, "key" : { "_id" : 1 }, "ns" : "test.alarm", "name" : "_id_" },
   { "v" : 1, "key" : { "serverName" : 1 }, "ns" : "test.alarm", "name" : "serverName_1" }
 ]

However, MongoDB takes 13 seconds to respond, whereas a similar query in SQL Server returns within 4 seconds, and that is also without an index.

Is there something I am missing?

Thanks in advance.

+3

mongodb

nkare Aug 21 '12 at 20:31

2 answers

Adam Comerford · Answer 1 · 2012-08-21 21:23

As you can see from the query you wrote, this type of aggregation in 2.0 requires you to run Map/Reduce. Map/Reduce on MongoDB carries performance penalties that have been covered on SO before: basically, unless you can parallelize across a cluster, you will be running single-threaded JavaScript through SpiderMonkey, which is not a speedy proposition. The index does not really help because your query is not selective: you still have to scan the entire index, and potentially every document as well.

With the imminent 2.2 release (in rc1 at the time of this writing) you have more options. The aggregation framework introduced in 2.2 is native rather than based on JS Map/Reduce, has a built-in $group operator, and was created specifically to speed up this kind of operation in MongoDB.

I would recommend giving 2.2 a shot to see whether your grouping performance improves. I think it will look something like this (note: not tested):

 db.alarm.aggregate( { $group : { _id : "$serverName", count : { $sum : 1 } }} );
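To make the intent of that pipeline concrete, here is a plain JavaScript sketch of the same count-per-key computation on a small in-memory array. The sample documents and the `groupCountByServerName` helper are made up for illustration; this is not how the server executes `$group`, only the result shape I would expect it to produce.

```javascript
// Hypothetical sample documents; only serverName matters here.
var sampleAlarms = [
  { serverName: "A", priority: "Major" },
  { serverName: "B", priority: "Minor" },
  { serverName: "A", priority: "Major" }
];

// Count documents per serverName, mirroring
// { $group: { _id: "$serverName", count: { $sum: 1 } } }.
function groupCountByServerName(docs) {
  var counts = Object.create(null); // no inherited keys to collide with
  docs.forEach(function (doc) {
    counts[doc.serverName] = (counts[doc.serverName] || 0) + 1;
  });
  // Emit one { _id, count } document per distinct key.
  return Object.keys(counts).map(function (name) {
    return { _id: name, count: counts[name] };
  });
}

var grouped = groupCountByServerName(sampleAlarms);
```

Each element of `grouped` has the group key in `_id` and the number of matching documents in `count`.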

Jenna · Answer 2 · 2012-08-21 21:25

Another option, and perhaps the most efficient solution at the moment, may be to use the distinct() command and count the results on the client side. http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Distinct
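One way to read that suggestion, as a minimal sketch for the mongo shell: fetch the distinct server names, then ask for a count per name. The helper name `countPerServerName` is hypothetical, and it assumes a shell collection object exposing `distinct()` and `count()`. It issues one count query per distinct name, so it is only likely to pay off when there are few distinct server names and the `serverName` index can serve each count.

```javascript
// Hypothetical helper: takes a mongo shell collection such as db.alarm.
function countPerServerName(collection) {
  var result = {};
  // distinct() returns an array of the unique serverName values.
  var names = collection.distinct("serverName");
  names.forEach(function (name) {
    // One count per name; the equality match can use the serverName index.
    result[name] = collection.count({ serverName: name });
  });
  return result;
}

// In the mongo shell: printjson(countPerServerName(db.alarm));
```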
