We have an application that can greatly benefit from using a document-based data warehouse, such as CouchDB. But we have a case of using a query that I am trying to implement using Map Reduce.
Our documents really contain only two types of data:
- Numeric attributes
- Logical attributes
Boolean attributes essentially mark a document as belonging to one or more non-exclusive sets. Numeric attributes always need only be summarized. One way to structure a document is as follows:
{
"id": 3123123,
"attr": {"x": 2, "y": 4, "z": 6},
"sets": ["A", "B", "C"]
}
Using this structure, it is easy to calculate the aggregate values of x, y, z for the sets A, B, and C, but it becomes more complicated if you want to see aggregates for intersections such as A & C.
ABC ( "A, B, C, AB, AC, BC, ABC" ), , . 80 , , .
, CouchDB, , , MongoDB - .
- ?