I'm thinking about trying MongoDB to store our statistics, but I have some general questions about whether I understand it correctly before I start studying it in depth.
I understand the concept of documents; what I don't understand is how much data can be stored inside each document. The following outline shows the layout I'm thinking of:
Website (document)
- some keys/values about the particular document
- statistics (tree)
- millions of records, one inserted per pageview (a key/value array containing data such as timestamp, IP, browser, etc.)
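For concreteness, here is roughly the document shape I have in mind, sketched as a plain JavaScript object (the field names are just my guesses for illustration, not anything MongoDB requires):

```javascript
// A hypothetical "website" document with per-pageview records
// nested inside it -- field names are illustrative only.
const website = {
  domain: "example.com",       // some keys/values about the particular site
  createdAt: "2009-10-01",
  statistics: [                // the "tree" of pageview records
    { timestamp: "2009-11-03T12:00:00Z", ip: "1.2.3.4", browser: "Firefox" },
    { timestamp: "2009-11-03T12:00:05Z", ip: "5.6.7.8", browser: "Safari" }
    // ...millions more, one per pageview
  ]
};

console.log(website.statistics.length); // 2 in this small sketch
```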
What caught my attention in MongoDB were grouping features such as:
http://www.mongodb.org/display/DOCS/Aggregation
db.test.group(
{ cond: {"invoked_at.d": {$gte: "2009-11", $lt: "2009-12"}}
, key: {http_action: true}
, initial: {count: 0, total_time:0}
, reduce: function(doc, out){ out.count++; out.total_time+=doc.response_time }
, finalize: function(out){ out.avg_time = out.total_time / out.count }
} );
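To check my understanding of what that group() call computes, here is the same logic written as plain JavaScript over an in-memory array (the sample documents and their values are made up by me; response times are in milliseconds here just to keep the arithmetic simple):

```javascript
// Simulating what group() does, step by step:
// filter by cond, group by key, apply reduce, then finalize.
const docs = [
  { invoked_at: { d: "2009-11" }, http_action: "GET /display", response_time: 50 },
  { invoked_at: { d: "2009-11" }, http_action: "GET /display", response_time: 70 },
  { invoked_at: { d: "2009-12" }, http_action: "GET /rss",     response_time: 30 }
];

// cond: {"invoked_at.d": {$gte: "2009-11", $lt: "2009-12"}}
const matching = docs.filter(d => d.invoked_at.d >= "2009-11" && d.invoked_at.d < "2009-12");

// key: {http_action: true}, initial: {count: 0, total_time: 0}
const groups = {};
for (const doc of matching) {
  if (!groups[doc.http_action]) {
    groups[doc.http_action] = { http_action: doc.http_action, count: 0, total_time: 0 };
  }
  const out = groups[doc.http_action];
  // reduce: function(doc, out){ out.count++; out.total_time += doc.response_time }
  out.count++;
  out.total_time += doc.response_time;
}

// finalize: function(out){ out.avg_time = out.total_time / out.count }
const results = Object.values(groups);
for (const out of results) out.avg_time = out.total_time / out.count;

console.log(results);
// → [ { http_action: 'GET /display', count: 2, total_time: 120, avg_time: 60 } ]
```

If my reading is right, the December document is filtered out by cond, and the two November pageviews are folded into one result per http_action.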
But my main concern is how heavy a command like this would be on the server if there were, say, 10 million records across dozens of documents, on a 512MB-1GB RAM server at Rackspace. Would it run without putting the server under heavy load?
Or is MongoDB simply the wrong tool for this? If so, what would you recommend for storing this kind of per-pageview data - something key/value based? Any advice is appreciated.
Thanks!