The algorithm for choosing the most popular places from the database

We have a site containing a database of places. For each place, our users can perform one of the following actions, which we record:

  • VIEW - Viewing Profile
  • RATING - Rate it on a scale of 1 to 5 stars.
  • REVIEW - View it
  • COMPLETED - note that they were there
  • LIST OF USERS - Mark what they want there.
  • FAVORITES - note that this is one of their favorites

Our table of places tables for each place contains a counter of the number of times each action was performed above, as well as the average rating specified by users.

  • view
  • ratings
  • avg_rating
  • completed
  • Wish List
  • favourite

What we want to do is generate lists of top places using the information above. Ideally, we would like to be able to generate this list using a relatively simple SQL query without the need to do any work to calculate additional fields or places of stack ranks against each other. However, since we only have about 50,000 places, we could run the cron night task to compute some fields, such as ranking by different categories, if it had a significant difference in the overall results of our best places.

I would appreciate if you could make some suggestions on how we should think about how to bubble in the best places on top, what criteria we should weigh more, and given this information, tell me what the MySQL query should look like to choose the top 10 places.

It should be noted that at present we are less concerned that popularity takes a popular place - this means that viewing the aggregate information is in order and that more recent data do not need to be weighted more intensively.

Thank you in advance for your help and advice!

+4
source share
3 answers

The exact formula for the pet should have come from you, let's call it f (x).
For the actual implementation, I would add the popular_score field, which I would calculate, as you said from the cron night job, using f (x) for each line.

Then this is just the case when you need to "select a place name from the order by popular_score desc table".

Ok - give it a hit popular_score = (FAVORITES * 3 + COMPLETED * 2 + WISHLIST) * RATING * VIEW / AVG_VIEWS_OF_ALL_PROFILES

+1
source

I have no opinion on how to weigh things.

However, why not just add a popularity column to your location table? Suddenly your SQL query is incredibly simple.

The hard part, of course, is figuring out how and when to update this value. But since you save all activity data, you can always restore popularity values โ€‹โ€‹from journal entries.

This way you get good quick queries for the โ€œmost popularโ€ locations, and if you want to change the way you calculate popularity, you can do so on your own.

If you are smart, you can develop a simple enough formula so that popularity can be tracked in real time. For example, if we are only interested in average ratings, you can change the average rating with just three variables: the current average rating, the number of object ratings and the new rating value.

Of course, things get more complicated when you start mixing how many times an object has been viewed, viewed, loved, etc .... but you may find that you can develop a method that is computationally cheap enough so that you can update the overall value popularity for almost every action.

+1
source

Source: https://habr.com/ru/post/1307907/


All Articles