MongoDB embedded document query

I have 2 dynamic documents:

 class Tasks(db.DynamicDocument):
     task_id = db.UUIDField(primary_key=True, default=uuid.uuid4)
     name = db.StringField()
     flag = db.IntField()

 class UserTasks(db.DynamicDocument):
     user_id = db.ReferenceField('User')
     tasks = db.ListField(db.ReferenceField('Tasks'), default=list)

I want to fetch a UserTasks document and check whether the flag value (from the Tasks document) of a given task is 0 or 1, given task_id and user_id. So I run the following query:

 obj = UserTasks.objects.get(user_id=user_id,tasks=task_id) 

This returns a UserTasks object.

Then I loop over its list of tasks, find the matching task, and check its flag value:

 task_list = obj.tasks
 for t in task_list:
     # each t is a dereferenced Tasks document
     if t['task_id'] == task_id:
         print(t['flag'])

Is there a better or more direct way to query the UserTasks document to get the flag value stored in the Tasks document?

PS: I could get the flag value directly from the Tasks document, but I also need to check whether the task is associated with the user, so I queried the UserTasks document directly.

2 answers

Is it possible to filter a document by the fields of its ReferenceField targets in a single query?

No. Filtering a document by the fields of a referenced document would require a join, and MongoDB queries do not join collections.

According to the MongoDB docs on database references:

MongoDB does not support joins. In MongoDB, some data is denormalized, or stored with related data in documents to remove the need for joins.

From another page on the official website:

If we used a relational database, we could join users and stores, and get all our objects in one query. But MongoDB does not support joins, and therefore, a bit of denormalization is required from time to time.

Relational purists already feel awkward, as if we were breaking some universal law. But let's keep in mind that MongoDB collections are not equivalent to relational tables; each serves a unique design purpose. A normalized table provides an atomic, isolated piece of data. However, the document more accurately represents the object as a whole.

So a single query against the UserTasks model cannot filter both on a task's flag value and on user_id and task_id.

How to filter?

To filter on the required conditions, we need to run 2 queries.

In the first query, we filter the Tasks model by task_id and flag. In the second query, we filter the UserTasks model by user_id and the task obtained from the first query.

Example:

Suppose we have user_id and task_id, and we need to check whether the associated task's flag has a value of 0.

First query

First we fetch my_task by task_id with flag equal to 0.

 my_task = Tasks.objects.get(task_id=task_id, flag=0) # 1st query 

Second query

Then, in the second query, we filter the UserTasks model by the given user_id and my_task.

 my_user_task = UserTasks.objects.get(user_id=user_id, tasks=my_task) # 2nd query 

Run the second query only if the first one returned a my_task object for the given task_id and flag. You will also need error handling for the case where no matching object exists, since get() raises an exception when nothing matches.
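The two-step check above, including the "no match" case, could be sketched roughly as follows. This is only an illustration: the helper name user_has_task_with_flag is hypothetical, and the model classes are passed in as arguments so the sketch does not depend on a live connection. It uses objects(...).first(), which returns None when nothing matches, instead of get().

```python
def user_has_task_with_flag(tasks_model, user_tasks_model, user_id, task_id, flag=0):
    """Return True if the task has the given flag AND is linked to the user.

    tasks_model / user_tasks_model are assumed to be the Tasks and UserTasks
    document classes from the question (hypothetical helper, not a library API).
    """
    # 1st query: find the task by id and flag; first() gives None if absent.
    my_task = tasks_model.objects(task_id=task_id, flag=flag).first()
    if my_task is None:
        # No such task with this flag, so the second query is skipped.
        return False

    # 2nd query: check that the user's UserTasks document references the task.
    my_user_task = user_tasks_model.objects(user_id=user_id, tasks=my_task).first()
    return my_user_task is not None
```

With the question's models this would be called as user_has_task_with_flag(Tasks, UserTasks, user_id, task_id).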

What if we used an EmbeddedDocument for the Tasks model?

Suppose we defined Tasks as an EmbeddedDocument and the tasks field of the UserTasks model as a list of EmbeddedDocumentField. Then a single query could perform the necessary filtering:

 my_user_task = UserTasks.objects.get(user_id=user_id, tasks__task_id=task_id, tasks__flag=0) 

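Getting a specific my_task from the task list

The query above returns a UserTasks document that contains all of the user's tasks, so we still need to iterate to find the right one. Note that the two tasks__ conditions may be satisfied by different elements of the list (MongoDB needs $elemMatch to pin several conditions to one element), so check both task_id and flag on the element itself.

To do this, we can scan the list with enumerate(). The desired index is then the first element of the resulting list.

 my_task_index = [i for i, v in enumerate(my_user_task.tasks) if v.task_id == task_id and v.flag == 0][0]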
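As a possible alternative to indexing a list comprehension with [0], which raises IndexError when nothing matches, next() with a default returns None instead. A minimal sketch, where find_task is a hypothetical helper and SimpleNamespace objects stand in for embedded task documents:

```python
from types import SimpleNamespace


def find_task(tasks, task_id, flag=0):
    """Return the first task with this task_id and flag, or None if absent."""
    return next(
        (t for t in tasks if t.task_id == task_id and t.flag == flag),
        None,  # default when no element matches
    )


# Stand-in data; with MongoEngine this would be my_user_task.tasks.
tasks = [
    SimpleNamespace(task_id="a", flag=1),
    SimpleNamespace(task_id="b", flag=0),
]
match = find_task(tasks, "b")      # the task with task_id "b"
missing = find_task(tasks, "a")    # None, because task "a" has flag 1
```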

@Praful, with your schema you will need two queries because MongoDB does not have joins. If you want to get all the data in one query, you need a schema designed for that case. ReferenceField is a special field that lazily loads a document from another collection, which requires an additional query.

Given the query you need, I recommend changing the schema accordingly. The idea behind NoSQL engines is denormalization, so it is fine to have a list of EmbeddedDocument objects. An EmbeddedDocument can be a smaller, denormalized version of the document with only a subset of the fields.

If you do not want to load the entire document into memory when querying, you can exclude fields using a projection. For example, if UserTasks holds a list of embedded task documents, you could leave that list out of the result:

 UserTasks.objects.exclude('tasks').filter(**filters) 
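The filters name in that call is not defined in the answer. One plausible shape for it, assuming the embedded-document schema and the double-underscore lookups shown earlier in this thread, is a plain dict of keyword filters. The build_filters helper is hypothetical:

```python
def build_filters(user_id, task_id, flag):
    """Build keyword filters for a UserTasks query (hypothetical helper).

    Keys follow the field__subfield lookup style used earlier in the thread
    and assume tasks is a list of embedded documents.
    """
    return {
        "user_id": user_id,
        "tasks__task_id": task_id,
        "tasks__flag": flag,
    }


filters = build_filters(user_id="u1", task_id="t1", flag=0)
# Intended use, as in the answer:
#   UserTasks.objects.exclude('tasks').filter(**filters)
```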

Hope this helps you.

Good luck


Source: https://habr.com/ru/post/1232125/

