How to rank or rate by votes

I am sorry if my question is unclear, but I need an idea: I am looking for a ranking algorithm that also takes into account the time at which the votes were cast.

+4
2 answers

Good question!

OK, let's get into it!

First of all, you cannot go wrong with calculating good old Bayesian average ratings.

You touched on it, but very simplified, it takes care of the following:

  • Entries with only a few votes are not rated by the true mean of their votes; their rating also contains a component of the average rating across your whole data set. For example, on IMDb the default rating is somewhere around 6.4. So a film with 2 votes of 10 stars each may still end up somewhere between 6 and 7. The more votes an entry gets, the more weight they carry, and the rating is “pulled away” from the default. IMDb also enforces a minimum number of votes before a film is shown in its lists.
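A minimal sketch of such a Bayesian average, in case it helps. The function name and the prior weight of 25 "virtual" votes are assumptions chosen for illustration; only the default of 6.4 comes from the IMDb example above, and IMDb's real parameters may differ.

```php
<?php
/**
 * Bayesian average: blend an entry's own votes with a site-wide default.
 *
 * $priorMean   = default rating across the data set (6.4 in the IMDb example)
 * $priorWeight = how many "virtual" default votes every entry starts with
 *                (hypothetical value, tune it for your data; must be > 0)
 */
function bayesianAverage(float $voteSum, int $voteCount, float $priorMean = 6.4, float $priorWeight = 25.0): float
{
    return ($priorWeight * $priorMean + $voteSum) / ($priorWeight + $voteCount);
}

// Two votes of 10 stars each stay close to the default ...
echo round(bayesianAverage(20, 2), 2), "\n";          // 6.67
// ... while 10,000 votes of 10 stars pull the rating almost all the way up.
echo round(bayesianAverage(100000, 10000), 2), "\n";  // 9.99
```

The larger the prior weight, the more votes an entry needs before its own mean dominates.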

Another thing I find confusing: why is the time of the vote important? Don't you mean the time of the entry that is being voted on? So, in our film example, newly released films are more important?

But anyway! In both cases, good results are often achieved by using logarithmic functions.

For our movie example, a movie's relevance can be multiplied by

1 + 1/SQRT(1 + CURRENT_YEAR - RELEASE_YEAR ) 

So 1 is the base rating that every movie receives. A movie from the current year gets a 100% boost (200% relevance), since the formula above returns 2. A movie from last year gets about 171%, one from two years ago about 158%, and so on.

But the difference between a film from 1954 and one from 1963 is nowhere near as large.
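To see these numbers, the formula can be evaluated directly. This is only a sketch; the function name `recencyBoost` and the fixed reference year 2020 are assumptions made so that the output is reproducible.

```php
<?php
// Boost factor from the formula above: 1 + 1/SQRT(1 + CURRENT_YEAR - RELEASE_YEAR).
// Assumes $releaseYear is not later than $currentYear.
function recencyBoost(int $releaseYear, int $currentYear): float
{
    return 1 + 1 / sqrt(1 + $currentYear - $releaseYear);
}

$currentYear = 2020; // fixed here so the output is reproducible
foreach ([2020, 2019, 2018, 1963, 1954] as $year) {
    printf("%d: %.3f\n", $year, recencyBoost($year, $currentYear));
}
// 2020: 2.000
// 2019: 1.707
// 2018: 1.577
// 1963: 1.131
// 1954: 1.122
```

One year makes a big difference for recent releases, while nine years between two old films changes the factor by less than 0.01.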

So remember:

  • For everything you use in your calculations, ask yourself: is it really linear? Could it distort your ratings? Are the values normally distributed across the data set?

If you want more recent votes to count more, you can do that in the same way, by weighting the votes. This also makes sense if you want items that were voted on recently to be “pushed” ... because they are currently hot and are being discussed in your community, for example.
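One way this could look is a weighted average in which each vote's weight shrinks with its age. The decay 1/sqrt(1 + age in days) simply reuses the square-root idea from the movie example; it is an assumption, as are the function name and the array layout of a vote.

```php
<?php
/**
 * Average star rating in which newer votes weigh more.
 * Each vote is ['stars' => number, 'time' => Unix timestamp].
 * The weight 1/sqrt(1 + age in days) is an assumed decay, not a standard.
 */
function timeWeightedRating(array $votes, int $now): ?float
{
    $weightedSum = 0.0;
    $weightTotal = 0.0;
    foreach ($votes as $vote) {
        $ageDays = max(0, $now - $vote['time']) / 86400;
        $weight = 1 / sqrt(1 + $ageDays);
        $weightedSum += $weight * $vote['stars'];
        $weightTotal += $weight;
    }
    // No votes at all: there is nothing to average.
    return $weightTotal > 0 ? $weightedSum / $weightTotal : null;
}

$now = 1600000000;
$votes = [
    ['stars' => 10, 'time' => $now],              // cast just now, weight 1
    ['stars' => 4,  'time' => $now - 3 * 86400],  // three days old, weight 0.5
];
echo timeWeightedRating($votes, $now), "\n"; // 8 (the plain average would be 7)
```

The result could then be fed into a Bayesian average instead of the plain vote sum, so that both effects are combined.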

That said, this is simply hard work: lots of playing around with the numbers, and so on.

Let me give you a final idea.

In the company I work for, we calculate relevance for films.

We have a configuration array in which we keep the “weighting” of several factors in the final relevance.

It looks like this:

  $weights = array(
      "year"             => 2,   // release year
      "rating"           => 13,  // rating 0-100
      "cover"            => 4,   // cover available?
      "shortdescription" => 4,   // short descr available?
      "trailer"          => 3,   // trailer available?
      "imdbpr"           => 13,  // google pagerank of imdb site
  );

Then we calculate a value from 0 to 1 for each metric. There are different methods. But let me show you an example of our rating (which in itself is an aggregated rating of several platforms that we scan and which have different weight values, etc.).

  // Age of the movie in years (0 for a release from the current year)
  $yearDiff = date('Y') - $data["year"];

  if (!$data["year"]) {
      $values['year'] = 0;      // release year unknown
  } else if ($yearDiff == 0) {
      $values['year'] = 1;
  } else if ($yearDiff < 3) {
      $values['year'] = 0.8;
  } else if ($yearDiff < 10) {
      $values['year'] = 0.6;
  } else {
      $values['year'] = 1 / sqrt(abs($yearDiff));
  }

So you see that we hardcoded some “age intervals” and relied on the sqrt function only for older movies. In fact, the difference is minimal there, so the SQRT example is not a great one. But math functions are very often useful!

You can, for example, also use periodic functions, such as sine curves, to calculate seasonal relevance! For example, if you map the year onto a range from 0 to 1, you can use the sine function to weight summer hits / winter hits / autumn hits according to the current time of the year!
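A rough sketch of the seasonal idea follows. It uses a cosine, which is just a sine shifted by a quarter period, so that the weight is 1 exactly at the season's peak. The function names and the peak position 0.55 are assumptions for illustration.

```php
<?php
/**
 * Seasonal weight between 0 and 1 from a periodic curve.
 * $yearFraction and $peakFraction are positions in the year mapped to 0..1
 * (0 = January 1st). The weight is 1 at the peak and 0 half a year away from it.
 */
function seasonalWeight(float $yearFraction, float $peakFraction): float
{
    return 0.5 + 0.5 * cos(2 * M_PI * ($yearFraction - $peakFraction));
}

// Position of a timestamp within its year, from 0 (inclusive) to 1 (exclusive).
function yearFraction(int $timestamp): float
{
    $daysInYear = 365 + (int) date('L', $timestamp);  // 'L' is 1 in leap years
    return (int) date('z', $timestamp) / $daysInYear; // 'z' is the day of the year, from 0
}

$summerPeak = 0.55; // assumed peak of a "summer hit", roughly the second half of July
printf("%.2f\n", seasonalWeight($summerPeak, $summerPeak));       // 1.00
printf("%.2f\n", seasonalWeight($summerPeak + 0.5, $summerPeak)); // 0.00
echo seasonalWeight(yearFraction(time()), $summerPeak), "\n";     // weight for today
```

Because the curve is periodic, late December and early January get nearly the same weight, which a hardcoded month table would need extra care to achieve.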

A last example, for the IMDb PageRank. It is completely hardcoded, since there are only 10 different values and they are not distributed in a statistically uniform way (a PageRank of 1 or 2 is even worse than none):

  if ($imdbpr >= 7) {
      $values['imdbpr'] = 1;
  } else if ($imdbpr >= 6) {
      $values['imdbpr'] = 0.9;
  } else if ($imdbpr >= 5) {
      $values['imdbpr'] = 0.8;
  } else if ($imdbpr >= 4) {
      $values['imdbpr'] = 0.6;
  } else if ($imdbpr >= 3) {
      $values['imdbpr'] = 0.5;
  } else if ($imdbpr >= 2) {
      $values['imdbpr'] = 0.3;
  } else if ($imdbpr >= 1) {
      $values['imdbpr'] = 0.1;
  } else if ($imdbpr >= 0) {
      $values['imdbpr'] = 0.0;
  } else {
      $values['imdbpr'] = 0.4; // no pagerank available. probably new
  }

Then sum everything up as follows:

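  $malus = 0;
  $weightSum = array_sum($weights);
  foreach ($values as $field => $value) {
      // each metric contributes its value (0 to 1) scaled by its share of the total weight
      $malus += ($value * $weights[$field]) / $weightSum;
  }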
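For a feel of what comes out, here is the same summation wrapped in a function and run on made-up sample values. The function name `relevance` and the sample values are hypothetical. Unlike the loop above, this sketch iterates over the weights and treats a missing metric as 0.

```php
<?php
// Weighted sum of per-metric values (each 0 to 1), normalised by the total weight
// so that the result also lies between 0 and 1. Metrics without a value count as 0.
function relevance(array $values, array $weights): float
{
    $weightSum = array_sum($weights);
    if ($weightSum <= 0) {
        return 0.0;
    }
    $score = 0.0;
    foreach ($weights as $field => $weight) {
        $score += ($values[$field] ?? 0) * $weight;
    }
    return $score / $weightSum;
}

$weights = ["year" => 2, "rating" => 13, "cover" => 4,
            "shortdescription" => 4, "trailer" => 3, "imdbpr" => 13];
// Made-up sample values for one movie:
$values = ["year" => 1, "rating" => 0.5, "cover" => 1,
           "shortdescription" => 1, "trailer" => 0, "imdbpr" => 0.8];

printf("%.3f\n", relevance($values, $weights)); // 0.690
```

Since the weights are normalised by their sum, you can change a single weight without having to rebalance the others.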

This may not be an exact answer to your question, and it is rather more general, but I hope I pointed you in the right direction and gave you a few points your own thinking can pick up from!

Good luck and have fun with your application!

+3

Reddit's code is open source. Their ranking algorithm, including code, is discussed quite well here: http://amix.dk/blog/post/19588

+1

Source: https://habr.com/ru/post/1338755/

