Based on the information you provided, I would recommend two possible approaches, both starting from the same foundation:
Use two collections (articles and platforms) and store only a reference to platform documents in an array defined on the article documents.
I would recommend this approach if:
- You have high cardinality of both article documents and platforms
- You want to be able to manage both entities independently, while also synchronizing the references between them
    // articles collection schema
    {
      "_id": ...,
      "title": "I am an article",
      ...
      "platforms": [ "platform_1", "platform_2", "platform_3" ],
      ...
    }

    // platforms collection schema
    {
      "_id": "platform_1",
      "name": "Platform 1",
      "url": "http://right/here",
      ...
    },
    {
      "_id": "platform_2",
      "name": "Platform 2",
      "url": "http://right/here",
      ...
    },
    {
      "_id": "platform_3",
      "name": "Platform 3",
      "url": "http://right/here",
      ...
    }
Even though this approach is quite flexible, it comes at a cost: if you need data from both the article and the platform, you will have to run more queries against the MongoDB instance, since the data is split across two different collections.
For example, when loading an article page, assuming that you also want to display the list of platforms, you need to run a query against the articles collection and then run a second query against the platforms collection to retrieve all platform documents that the article is published to, as listed in the platforms array on the article document.
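The two round trips described above can be sketched as follows. This is only an illustration: plain in-memory arrays stand in for the two collections so the snippet runs without a database, `loadArticlePage` and the `article_1` id are hypothetical names, and the comments show the shell queries each step would roughly correspond to.

```javascript
// In-memory stand-ins for the two collections (illustrative data only).
const articles = [
  { _id: "article_1", title: "I am an article", platforms: ["platform_1", "platform_3"] },
];
const platforms = [
  { _id: "platform_1", name: "Platform 1", url: "http://right/here" },
  { _id: "platform_2", name: "Platform 2", url: "http://right/here" },
  { _id: "platform_3", name: "Platform 3", url: "http://right/here" },
];

// Hypothetical helper: load an article together with its platform documents.
function loadArticlePage(articleId) {
  // Query 1, shell equivalent: db.articles.findOne({ _id: articleId })
  const article = articles.find((a) => a._id === articleId);
  if (!article) return null;

  // Query 2, shell equivalent:
  //   db.platforms.find({ _id: { $in: article.platforms } })
  const linked = platforms.filter((p) => article.platforms.includes(p._id));

  // Return the article with the references replaced by full platform documents.
  return { ...article, platforms: linked };
}

const page = loadArticlePage("article_1");
console.log(page.platforms.map((p) => p.name));
```

Note that the second query cannot start until the first one has returned, which is where the extra latency of this design comes from.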
However, if there is only a small subset of frequently accessed platform attributes that you need when loading an article document, you can enhance the platforms array in the articles collection to store those attributes in addition to the _id reference to the platform document:
    // enhanced articles collection schema
    {
      "_id": ...,
      "title": "I am an article",
      ...
      "platforms": [
        { platform_id: "platform_1", name: "Platform 1" },
        { platform_id: "platform_2", name: "Platform 2" },
        { platform_id: "platform_3", name: "Platform 3" }
      ],
      ...
    }
This hybrid approach is suitable if the platform attributes that you frequently retrieve to display together with article-specific data rarely change.
Otherwise, you will have to synchronize every update made to platform document attributes in the platforms collection with the subset of attributes that you track as part of the platforms array on article documents.
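To make the synchronization cost concrete, here is a minimal sketch of what renaming a platform involves under the hybrid schema. It again uses in-memory arrays in place of real collections; `renamePlatform`, `enhancedArticles`, and `platformDocs` are hypothetical names, and it assumes each platform appears at most once per article. The comments show the shell updates that would roughly do the same work, using the positional `$` operator.

```javascript
// In-memory stand-ins for the collections (illustrative data only).
const platformDocs = [
  { _id: "platform_1", name: "Platform 1", url: "http://right/here" },
  { _id: "platform_2", name: "Platform 2", url: "http://right/here" },
];
const enhancedArticles = [
  {
    _id: "article_1",
    platforms: [
      { platform_id: "platform_1", name: "Platform 1" },
      { platform_id: "platform_2", name: "Platform 2" },
    ],
  },
  { _id: "article_2", platforms: [{ platform_id: "platform_2", name: "Platform 2" }] },
];

// Hypothetical helper: rename a platform and propagate the new name to
// every embedded copy. Returns the number of articles that were touched.
function renamePlatform(platformId, newName) {
  // Shell equivalent:
  //   db.platforms.updateOne({ _id: platformId }, { $set: { name: newName } })
  const platform = platformDocs.find((p) => p._id === platformId);
  if (platform) platform.name = newName;

  // Shell equivalent:
  //   db.articles.updateMany(
  //     { "platforms.platform_id": platformId },
  //     { $set: { "platforms.$.name": newName } }
  //   )
  let touched = 0;
  for (const article of enhancedArticles) {
    const entry = article.platforms.find((e) => e.platform_id === platformId);
    if (entry) {
      entry.name = newName;
      touched += 1;
    }
  }
  return touched;
}

console.log(renamePlatform("platform_2", "Platform Two"));
```

Every platform update becomes a write against the platforms collection plus a multi-document write against articles, which is why this variant pays off mainly when the copied attributes are stable.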
As for managing the list of articles for a specific platform, I would not recommend storing N-to-N references in both collections, as the mechanism above already lets you retrieve article lists by querying the articles collection with the _id value of the platform document:
    // Approach #1
    db.articles.find({"platforms": "platform_1"});

    // Approach #2
    db.articles.find({"platforms.platform_id": "platform_1"});
Having presented two different approaches, I would recommend that you analyze the query patterns and performance thresholds of your application and make a design decision based on the scenarios that you encounter.