How do you take the average of big data in MongoDB and CouchDB?

I'm looking at this chart ...

http://www.mongodb.org/display/DOCS/MongoDB,+CouchDB,+MySQL+Compare+Grid

... which says:

Query method

CouchDB - map/reduce of JavaScript functions to lazily build an index per query

MongoDB - dynamic; object-based query language

What exactly does this mean? For example, if I want to take an average of 1,000,000,000 values, does CouchDB do this automatically using the MapReduce method?

Can someone walk me through how to take an average of 1,000,000,000 values in both systems ... this would be a very illuminating example.

Thanks.

2 answers

CouchDB views are a strange and fascinating beast.

CouchDB does incremental map/reduce: once you define your "view", it works like a materialized view in a relational database. It doesn't matter whether you average 3 or 3 billion documents. The result is already there.

But there is a triple gotcha:

1) Queries are fast once the view has been built and is up to date. Building the view can be slow if you have many small documents (if possible, use larger documents). Once the view is built, the intermediate reduce results are stored inside the B-tree nodes, so you don't have to recompute them.

2) Views are updated lazily at query time. To get predictable performance, you'd better set up some kind of job to refresh them regularly. See "How do you schedule index updates in CouchDB".

3) You need to have a pretty good idea of how you will query your data using compound keys, ranges, and grouping. CouchDB sucks at ad-hoc queries. http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views
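To make the third point concrete, here is a rough sketch of a map function that emits a compound key, together with a small local stand-in for what querying with `group_level` does. The fields `country` and `city`, the sample documents, and the helper `groupLevel` are all hypothetical; only `emit` and the `group_level` query parameter come from CouchDB itself, and `emit` is re-implemented here so the snippet runs outside CouchDB.

```javascript
// Hypothetical documents; `country`, `city` and `height` are made-up fields.
const docs = [
  { _id: "a", country: "US", city: "NYC", height: 64 },
  { _id: "b", country: "US", city: "LA", height: 75 },
  { _id: "c", country: "DE", city: "Berlin", height: 58 }
];

// Local stand-in for CouchDB's global `emit`, so the map function can run here.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Map function with a compound key: [country, city].
function map(doc) {
  emit([doc.country, doc.city], doc.height);
}

docs.forEach((doc) => map(doc));

// Rough emulation of querying with ?group_level=N and a sum/count reduce.
// Unlike CouchDB, this does not sort the keys.
function groupLevel(viewRows, level) {
  const groups = new Map();
  for (const { key, value } of viewRows) {
    const prefix = JSON.stringify(key.slice(0, level));
    const g = groups.get(prefix) || { sum: 0, count: 0 };
    g.sum += value;
    g.count += 1;
    groups.set(prefix, g);
  }
  return [...groups].map(([k, v]) => ({ key: JSON.parse(k), value: v }));
}

const byCountry = groupLevel(rows, 1); // one row per country
const byCity = groupLevel(rows, 2);    // one row per [country, city]
console.log(JSON.stringify(byCountry));
```

The key layout decides which groupings are cheap later, which is why the query patterns have to be known when the view is written.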

I'm sure someone will soon post how to average 1,000,000,000 items in both databases, but you should understand that CouchDB makes you do more work upfront in order to benefit from its incremental approach. It is really something unique, but it is not well suited to scenarios where you compute averages, or anything else, over data in an ad-hoc fashion.

In Mongo, you can use either map/reduce (not incremental: it does matter whether you average 3 or 3 billion documents, but Mongo is considered extremely fast thanks to its memory-mapped I/O) or its aggregation features. http://www.mongodb.org/display/DOCS/Aggregation
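For the MongoDB side, a minimal sketch using the aggregation pipeline (`$group` with `$avg`, available in MongoDB 2.2 and later) might look like this. The collection name `people` and the field `height` are assumptions made for illustration; they are not part of the answer above.

```javascript
// Pipeline that averages the (assumed) numeric field `height` over all documents.
// `_id: null` puts every document into one single group.
const pipeline = [
  {
    $group: {
      _id: null,
      avgHeight: { $avg: "$height" }, // arithmetic mean of `height`
      count: { $sum: 1 }              // number of documents seen
    }
  }
];

// In the mongo shell, against an assumed collection named `people`:
//   db.people.aggregate(pipeline)
// Each result document has the shape { _id: null, avgHeight: <number>, count: <number> }.
console.log(JSON.stringify(pipeline));
```

Unlike a CouchDB view, this scans the matching documents on every call, so the cost grows with the number of documents.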


I can't speak for MongoDB, but I can speak for CouchDB. CouchDB can only be queried through its Map/Reduce view engine. In fact, a great place to start is the wiki section.

A view consists of a map function and an optional reduce function. JavaScript is the default language for writing these functions, but there is an Erlang option, and you can build a view engine in just about any other programming language.

The map function is used to build a dataset from the documents in the database. The reduce function aggregates that dataset. The map function therefore runs on every single document in the database once the view is created (and first queried). After that, it runs only on documents that have been created or modified/deleted. View indexes are thus built incrementally, not dynamically.

With 1,000,000,000 values, CouchDB will not need to recalculate the results of your query every time it is queried. Instead, it simply reports the value from the index it has saved, which itself changes only when a document is created, updated, or deleted.
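The idea of a stored aggregate that only changes when a document changes can be sketched in plain JavaScript. This is an illustration of the principle only, not CouchDB's actual B-tree implementation, and the class name `RunningAverage` is made up for the example.

```javascript
// Running aggregate kept alongside the data, in the spirit of a stored view index.
class RunningAverage {
  constructor() {
    this.sum = 0;
    this.count = 0;
    this.values = new Map(); // id -> last value seen, needed to undo old contributions
  }

  // Called when a document is created or updated.
  upsert(id, value) {
    this.remove(id); // drop the previous contribution, if any
    this.values.set(id, value);
    this.sum += value;
    this.count += 1;
  }

  // Called when a document is deleted.
  remove(id) {
    if (!this.values.has(id)) return;
    this.sum -= this.values.get(id);
    this.count -= 1;
    this.values.delete(id);
  }

  // Reading the average touches only the stored aggregate, not the documents.
  average() {
    return this.count === 0 ? null : this.sum / this.count;
  }
}

const index = new RunningAverage();
index.upsert("a", 64);
index.upsert("b", 75);
index.upsert("c", 58);
index.upsert("b", 70); // an update adjusts one contribution only
index.remove("c");
console.log(index.average()); // (64 + 70) / 2 = 67
```

Each write costs a constant amount of work, and a read costs the same whether there are three documents or a billion.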

As for writing the Map/Reduce functions, most of that work is left to the programmer, since there are no built-in map functions (i.e. it is not "automatic"). However, there are several built-in reduce functions (_sum, _count, _stats).

Here is a simple example: we will calculate the average height of some people.

    // sample documents
    { "_id": "Dominic Barnes", "height": 64 }
    { "_id": "Some Tall Guy", "height": 75 }
    { "_id": "Some Short(er) Guy", "height": 58 }

    // map function
    function (doc) {
      // first param is "key", which we do not need since `_id` is stored anyways
      emit(null, doc.height);
    }

    // reduce function
    _stats

The results of this view will look like this:

    {
      "rows": [
        {
          "key": null,
          "value": {
            "sum": 197,
            "count": 3,
            "min": 58,
            "max": 75,
            "sumsqr": 13085
          }
        }
      ]
    }

Calculating the average from here is as simple as dividing the sum by the count. If you want the average to be calculated within the view itself, you can check out this example.
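On the client side, that division might be sketched like this. The `response` object repeats the view result shown above, and the helper name `averageFromStats` is made up for the example.

```javascript
// View response as shown above: with the reduce applied there is a single row.
const response = {
  rows: [
    { key: null, value: { sum: 197, count: 3, min: 58, max: 75, sumsqr: 13085 } }
  ]
};

// Average = sum / count; returns null if the count is zero.
function averageFromStats(stats) {
  return stats.count === 0 ? null : stats.sum / stats.count;
}

const stats = response.rows[0].value;
const average = averageFromStats(stats);
console.log(average.toFixed(2)); // 197 / 3, i.e. "65.67"
```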


Source: https://habr.com/ru/post/892643/

