In MongoDB, given by the find () operator, which returns a cursor for a set of rows, what is an idiomatic and time-efficient way to return βcontextualβ rows, that is, rows sequentially before and / or after each row in a set?
For me, the easiest way to explain this concept is to use ack , which supports contextual searching. Given the file:
line 1 line 2 line 3 line 4 line 5 line 6 line 7 line 8
This is the result from ack:
C:\temp>ack.pl -C 2 "line 4" test.txt line 2 line 3 line 4 line 5 line 6
I store log data in a MongoDB collection, one document per line. Each magazine, each of which is tagged with keywords, and these keywords are indexed, which gives me a full-text search.
I follow the swamp standard:
collection.find({keywords: {'$all': ['key1', 'key2']}}, {}).sort({datetime: -1});
and get the cursor. At this point, without adding any additional fields, what is the approach to get the context? I think the stream looks something like this:
- For each line in the cursor:
- Get the _id field, save to x.
- execute: collection.find ({_ id: {'$ gt': x}}). limit (N)
- Get results from each of these cursors.
- execute: collection.find ({_ id: {'$ lt': x}}) sort ({_ id: 1}). limit (N)
- Get results from each of these cursors.
For a result set with R rows, this requires 2R + 1 queries.
However, I think I can exchange space for time. Is there an alternative to updating each line with its _id context in the background? For a given line that currently has fields:
_id, contents, keywords
I would add an extra field:
_id, contents, keywords, context_ids
and then in a subsequent search, I could somehow use these context_ids, I think? I am not familiar with MongoDB MapReduce yet, but can it also get on the photo?
I think the most direct approach is to store the full text of the actual context lines in each line, but for me it is a little rude. The clear advantage is that a single request can return the context I need.
I appreciate any and all answers that agree on the scope of the question. I understand that I can use Lucene or a true full-text search engine out of range, but I'm trying to feel the edges and possibilities of MongoDB, so I will be grateful to the specific answers of MongoDB. Thanks!