Should I `$inc` a follower count on the product, or use a separate collection to track followers?

I load products with endless scrolling, in batches of 12 at a time.
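The batch loading boils down to a skip/limit window over the sorted query. A minimal helper for computing that window (a pure-Python sketch; the function name `page_window` is my own, not from the original code):

```python
def page_window(page, per_page=12):
    """Return the (skip, limit) pair for a zero-based page of an endless scroll.

    Intended use (hypothetical): db.products.find().sort('followers', -1)
                                   .skip(skip).limit(limit)
    """
    if page < 0:
        raise ValueError("page must be non-negative")
    return page * per_page, per_page
```

For page 0 this yields `(0, 12)`, for page 2 it yields `(24, 12)`, matching the 12-at-a-time loading described above.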

Sometimes they need to be sorted by the number of followers they have.

Here is how I track the number of followers of each product.


Each follow is stored in a separate collection, because of MongoDB's 16 MB document size limit and because the number of follows per product should be unbounded.

Follow schema:

```javascript
var FollowSchema = new mongoose.Schema({
    user: { type: mongoose.Schema.ObjectId, ref: 'User' },
    product: { type: mongoose.Schema.ObjectId, ref: 'Product' },
    timestamp: { type: Date, default: Date.now }
});
```

Product schema:

```javascript
var ProductSchema = new mongoose.Schema({
    name: { type: String, unique: true, required: true },
    followers: { type: Number, default: 0 }
});
```

Whenever a user follows/unfollows a product, I run this function:

```javascript
ProductSchema.statics.updateFollowers = function (productId, val) {
    return Product
        .findOneAndUpdateAsync(
            { _id: productId },
            { $inc: { followers: val } },
            { upsert: true, 'new': true }
        )
        .then(function (updatedProduct) {
            return updatedProduct;
        })
        .catch(function (err) {
            console.log('Product follower update err : ', err);
        });
};
```
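For reference, `$inc` combined with `upsert: true` either bumps the counter on an existing document or creates the document and applies the increment to the default of 0, returning the updated document because of `new: true`. A small in-memory Python model of that behavior (an illustration only, not the real driver; the dict-backed `store` is my own stand-in for the collection):

```python
def inc_followers(store, product_id, val):
    """Emulate findOneAndUpdate({_id}, {$inc: {followers: val}},
    {upsert: True, new: True}) against a dict-backed 'collection'."""
    doc = store.get(product_id)
    if doc is None:
        # upsert: create the document; the counter starts at its default of 0
        doc = {'_id': product_id, 'followers': 0}
        store[product_id] = doc
    doc['followers'] += val
    return doc  # 'new: true' semantics: return the post-update document


store = {}
inc_followers(store, 'p1', 1)   # follow: creates the doc, followers -> 1
inc_followers(store, 'p1', 1)   # second follower, followers -> 2
inc_followers(store, 'p1', -1)  # unfollow, followers -> 1
```

Passing `1` on follow and `-1` on unfollow, as the function above does, is exactly what `updateFollowers(productId, val)` delegates to `$inc`.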

My questions:

1: Is there a chance that the embedded followers counter on the product could hit some kind of error and end up with invalid/inconsistent data?

2: Would it be better to run an aggregation to count the followers of each product, or would that be too expensive/slow?

Eventually I would probably rewrite this for a graph database, which seems more appropriate, but for now this is an exercise in mastering MongoDB.

3 answers

1. If you increment the counter after inserting a follow document, or decrement it after deleting one, there is a chance of inconsistency: for example, the insert succeeds but the increment fails.
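One common mitigation for that drift is to treat the follows collection as the source of truth and periodically recount. A minimal in-memory Python sketch of such a repair pass (the function name and dict-backed collections are my own, for illustration):

```python
def recount_followers(follows, products):
    """Rebuild each product's counter from the follows collection,
    overwriting any drifted embedded counts."""
    counts = {}
    for f in follows:
        counts[f['product']] = counts.get(f['product'], 0) + 1
    for pid, doc in products.items():
        doc['followers'] = counts.get(pid, 0)


# 'p1' has drifted to 5 even though only 3 follow documents exist
products = {'p1': {'followers': 5}, 'p2': {'followers': 2}}
follows = [{'product': 'p1'}, {'product': 'p1'}, {'product': 'p1'}]
recount_followers(follows, products)
# after the repair: p1 -> 3, p2 -> 0
```

In real MongoDB the same recount would be a `$group`/`$sum` aggregation over the follows collection followed by bulk updates, run on a schedule rather than per request.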

2. Intuitively, aggregation is much more expensive than a plain find in this case. I ran a test to prove it.

First create 1,000 users, 1,000 products and 10,000 follows at random. Then compare with this code:

```python
import timeit

from pymongo import MongoClient

db = MongoClient('mongodb://127.0.0.1/test', tz_aware=True).get_default_database()


def foo():
    result = list(db.products.find().sort('followers', -1).limit(12).skip(12))


def bar():
    result = list(db.follows.aggregate([
        {'$group': {'_id': '$product', 'followers': {'$sum': 1}}},
        {'$sort': {'followers': -1}},
        {'$skip': 12},
        {'$limit': 12},
    ]))


if __name__ == '__main__':
    t = timeit.timeit('foo()', 'from __main__ import foo', number=100)
    print('time: %f' % t)
    t = timeit.timeit('bar()', 'from __main__ import bar', number=100)
    print('time: %f' % t)
```

Output:

```
time: 1.230138
time: 3.620147
```

Creating an index speeds up the find query considerably:

```javascript
db.products.createIndex({ followers: 1 })
```

```
time: 0.174761
time: 3.604628
```

And if you need attributes from the product, such as its name, the aggregation approach requires another O(n) query.
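That extra step amounts to fetching the product documents for the aggregated IDs (e.g. with an `$in` query) and joining in the application. A small in-memory Python sketch of the join half (function name and sample data are my own):

```python
def attach_names(agg_page, products_by_id):
    """Join one aggregation page (product id + follower count) with the
    product documents fetched in a second query."""
    return [
        {'_id': row['_id'],
         'name': products_by_id[row['_id']]['name'],
         'followers': row['followers']}
        for row in agg_page
    ]


# one page of aggregation output, already sorted by followers descending
agg_page = [{'_id': 'p2', 'followers': 7}, {'_id': 'p1', 'followers': 3}]
# result of the second query, keyed by _id
products_by_id = {'p1': {'name': 'Widget'}, 'p2': {'name': 'Gadget'}}
page = attach_names(agg_page, products_by_id)
```

With the embedded counter, by contrast, a single `find` returns names and counts together, which is part of why the benchmark above favors it.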

My guess is that as the data grows, aggregation will get much slower still. If necessary, I can benchmark on larger data sets.


For number 1: if the only operations on this field are increments and decrements, I think you will be fine. If you start duplicating this data elsewhere or using it in joins for any reason, you risk inconsistent data.

For number 2, I recommend running both approaches in the mongo shell to benchmark them. You can also look at the explain plans for both queries to get an idea of which will perform better. I am just guessing, but it seems the incremented-counter route will perform well.

Also, the amount of data you expect matters. One approach could work perfectly at first, but past a million records the other could be the way to go. If you have a test environment, that would be a good place to check.


1) It is up to the application layer to ensure consistency, and as such there is a chance you will end up with inconsistencies. The questions I would ask are: how important is consistency in this case, and how likely is it to drift far? My take is that being off by a single follower matters less than making your endless scroll load as fast as possible for a better user experience.

2) It is probably worth profiling, but if I had to guess, I would say the aggregation approach would be the slower one.


Source: https://habr.com/ru/post/1012871/

