Matching users who reviewed the same restaurants

I am working on an algorithm that calculates a similarity coefficient between restaurants. Before this similarity can be computed, I need at least 3 users who rated both restaurants. Here is a table with a possible scenario as an example:

           | Restaurant 1 | Restaurant 2
    User 1 | X            | 2
    User 2 | 1            | 5
    User 3 | 4            | 3
    User 4 | 2            | 1
    User 5 | X            | 5

Here X means no review, and the numbers are each user's rating for that restaurant. You can see that the similarity can be calculated because users 2, 3, and 4 rated both restaurants.

Because I use the adjusted cosine similarity, I also need each user's average rating.
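For reference, the adjusted cosine similarity between two restaurants i and j is commonly written as below. The symbols are introduced here only for illustration and do not appear in the code: U_ij is the set of users who rated both restaurants, r_{u,i} is user u's rating of restaurant i, and \bar{r}_u is user u's average rating over all of their reviews.

```latex
\mathrm{sim}(i, j) =
  \frac{\sum_{u \in U_{ij}} (r_{u,i} - \bar{r}_u)\,(r_{u,j} - \bar{r}_u)}
       {\sqrt{\sum_{u \in U_{ij}} (r_{u,i} - \bar{r}_u)^2}\;
        \sqrt{\sum_{u \in U_{ij}} (r_{u,j} - \bar{r}_u)^2}}
```

Both sums run only over users who rated both restaurants, which is why a minimum number of such users is needed before the value is meaningful.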

Right now I have a list of all restaurants, and I use the following nested for loop to check, for every pair of restaurants, whether the similarity can be calculated:

    for (int i = 0; i < allRestaurants.Count; i++)
        for (int j = 0; j < allRestaurants.Count; j++)
            if (i < j)
                matrix.Add(new Similarity()
                {
                    Id = Guid.NewGuid(),
                    FirstRest = allRestaurants[i],
                    SecondRest = allRestaurants[j],
                    Sim = ComputeSimilarity(allRestaurants[i], allRestaurants[j], allReviews)
                });

Inside ComputeSimilarity I use the following LINQ statement to check the number of matches:

    public double ComputeSimilarity(Guid restaurant1, Guid restaurant2, IEnumerable<Tuple<List<Review>, double>> allReviews)
    {
        //The double in the list of allReviews is the average rating of the user.
        var matches = (from R1 in allReviews.SelectMany(x => x.Item1).Where(x => x.RestaurantId == restaurant1)
                       from R2 in allReviews.SelectMany(x => x.Item1).Where(x => x.RestaurantId == restaurant2)
                       where R1.UserId == R2.UserId
                       select Tuple.Create(R1, R2, allReviews.Where(x => x.Item1.FirstOrDefault().UserId == R1.UserId)
                                                             .Select(x => x.Item2)
                                                             .FirstOrDefault()))
                      .DistinctBy(x => x.Item1.UserId);

        int amountOfMatches = matches.Count(); //Don't mind this, not looking for performance here at the moment.
        if (amountOfMatches < 4)
            return 0;

        // (rest of the method omitted)
    }

As you can see, this approach is very expensive: the nested for loop takes a lot of time as the number of restaurants grows.

I figured the best approach would be to fetch only the restaurants that already meet this requirement, but I am stuck on how to do this. I think the result could be a list of "matches", i.e. a list of tuples of type Tuple<Review, Review, double>, where the two reviews are from the same user and the double is that user's average rating.

I have made several attempts, but I get stuck every time I try to add the condition that only restaurants with at least 3 matches should be included.
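One way such a tuple list could be built for a single pair of restaurants is sketched below. It is only a sketch under assumptions: `MatchFinder`, `FindMatches`, and the `minMatches` parameter are hypothetical names, the `Review` class is a trimmed stand-in for the real entity, and each user is assumed to review a given restaurant at most once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for the Review entity (only the fields used here).
public class Review
{
    public Guid UserId { get; set; }
    public Guid RestaurantId { get; set; }
    public int Rating { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class MatchFinder
{
    // Returns one (review1, review2, userAverage) tuple per user who rated both
    // restaurants, or an empty list when fewer than minMatches users did.
    public static List<Tuple<Review, Review, double>> FindMatches(
        IEnumerable<Review> reviews, Guid restaurant1, Guid restaurant2, int minMatches = 3)
    {
        var all = reviews.ToList();

        // Average rating per user, computed once over all of that user's reviews.
        var averageByUser = all
            .GroupBy(r => r.UserId)
            .ToDictionary(g => g.Key, g => g.Average(r => r.Rating));

        // Join the reviews of the two restaurants on UserId.
        // Assumes a user reviews a given restaurant at most once.
        var matches = all.Where(r => r.RestaurantId == restaurant1)
            .Join(all.Where(r => r.RestaurantId == restaurant2),
                  r1 => r1.UserId,
                  r2 => r2.UserId,
                  (r1, r2) => Tuple.Create(r1, r2, averageByUser[r1.UserId]))
            .ToList();

        return matches.Count >= minMatches
            ? matches
            : new List<Tuple<Review, Review, double>>();
    }
}
```

Compared with the nested `from` clauses, `Join` looks users up by key instead of comparing every review of one restaurant with every review of the other, and the averages are computed once instead of per match.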

For reference, my review object is as follows:

    public class Review
    {
        [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
        public virtual Guid Id { get; set; }
        public virtual int Rating { get; set; }
        public virtual Guid RestaurantId { get; set; }
        public virtual Guid UserId { get; set; }
        //More irrelevant attributes here
    }

And my restaurant object:

    public class Restaurant
    {
        [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
        public virtual Guid Id { get; set; }
        //More irrelevant attributes here
    }

I am looking for something better than my current approach. Can anyone point me in the right direction or suggest a better approach? Also, if you need more information, let me know! Thanks in advance!

Edit: The first example shows two restaurants, but the list could be longer, of course. The point is that I only want the restaurants for which the similarity can be calculated.

So take the following example:

           | Restaurant 1 | Restaurant 2 | Restaurant 3
    User 1 | X            | 2            | X
    User 2 | 1            | 5            | X
    User 3 | 4            | 3            | 3
    User 4 | 2            | 1            | 2
    User 5 | X            | 5            | X
    User 6 | X            | X            | 2

The only possible match is between restaurant 1 and restaurant 2. Restaurant 3 does not have enough matches with either of them (at least 3 are required in this case), so no similarity can be calculated for it. To optimize this, I want to build a list of only those restaurants for which the similarity can be calculated.

To explain further, a match is a user who rated both restaurants of a pair. Restaurant 3 has 3 reviews, but only 2 of them are matches, because User 6 rated only this restaurant.

So, if we give the 3 restaurants above as input, it should produce a list of the restaurants for which the similarity can be calculated (in this case, only restaurants 1 and 2).
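A rough sketch of that idea: group the reviews by user, emit every pair of restaurants each user rated, and keep the pairs produced by at least 3 users. The names `CoRatedPairs` and `Find` are hypothetical, and plain int ids are used instead of Guids to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original code.
public static class CoRatedPairs
{
    // reviews: (userId, restaurantId) entries, one per review.
    // Returns every restaurant pair (A < B) rated by at least minUsers users,
    // together with the number of users who rated both.
    public static Dictionary<(int A, int B), int> Find(
        IEnumerable<(int UserId, int RestaurantId)> reviews, int minUsers = 3)
    {
        return reviews
            .GroupBy(r => r.UserId, r => r.RestaurantId)
            // For each user, emit every unordered pair of restaurants they rated.
            .SelectMany(g =>
            {
                var rated = g.Distinct().OrderBy(id => id).ToList();
                return rated.SelectMany((a, i) => rated.Skip(i + 1).Select(b => (A: a, B: b)));
            })
            // Count how many users produced each pair and keep the frequent ones.
            .GroupBy(pair => pair)
            .Where(g => g.Count() >= minUsers)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        // The three-restaurant table above as (user, restaurant); X entries are omitted.
        var reviews = new[]
        {
            (1, 2),
            (2, 1), (2, 2),
            (3, 1), (3, 2), (3, 3),
            (4, 1), (4, 2), (4, 3),
            (5, 2),
            (6, 3),
        };

        // Prints: Restaurants 1 and 2: 3 common users
        foreach (var entry in Find(reviews))
            Console.WriteLine($"Restaurants {entry.Key.A} and {entry.Key.B}: {entry.Value} common users");
    }
}
```

With the table above as input, only the pair of restaurants 1 and 2 is returned; the pairs involving restaurant 3 have 2 common users each and are dropped. The amount of work depends on how many restaurants each user rated, not on the total number of restaurant pairs.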

Edit 2: I will add an example of what my desired result should look like:

A "match" is where at least 3 users have rated the same 2 restaurants. So, let's say we have restaurants X and Y; the output may look like this:

           | Restaurant X | Restaurant Y
    User 1 | 5            | 3
    User 2 | 2            | 5
    User 3 | 1            | 2

Now, if we add a third restaurant to the list, which each user has also reviewed:

           | Restaurant X | Restaurant Y | Restaurant Z
    User 1 | 5            | 3            | 2
    User 2 | 2            | 5            | 3
    User 3 | 1            | 2            | 1

Now a similarity can be calculated between every pair of restaurants here: X and Y, X and Z, and Y and Z.

This can be modeled in a separate class as follows:

    public class Match
    {
        //These two reviews have been left by the same user, on separate restaurants.
        public Review rev1 { get; set; }
        public Review rev2 { get; set; }
    }

The similarity can be calculated once we have 3 of these matches that all share the same RestaurantId on rev1 and the same RestaurantId on rev2.

So, a list of these matches might look like this:

  • Match 1: rev1.RestaurantId = 1 | rev2.RestaurantId = 2 | UserId = 11 (the same UserId on rev1 and rev2)
  • Match 2: rev1.RestaurantId = 1 | rev2.RestaurantId = 2 | UserId = 12 (the same UserId on rev1 and rev2)
  • Match 3: rev1.RestaurantId = 1 | rev2.RestaurantId = 2 | UserId = 13 (the same UserId on rev1 and rev2)

I know that the identifiers are Guids, but this is purely an example.

I hope this made sense.

2 answers

I think I did what you are trying to achieve.

I created a database with the Review table from your post and filled it with the same data as the table you show in your Edit.


Step 1

First, I group by RestaurantId; the values are, for each user who rated that restaurant, the list of all of that user's reviews.

This gives us the following:

[image: result of step 1]

Step 2

Exclude restaurants that have fewer than 3 reviewers who have each reviewed at least 2 restaurants.

This gives us the following:

[image: result of step 2]

Step 3

We now have the correct list of restaurants, but we still need to exclude the reviews of users who do not match. Then we flatten everything so that only restaurants and their reviews remain.

This gives us the end result:

[image: result of step 3]


Here is the code:

    var matches = this.Reviews
        .GroupBy(r => r.RestaurantId, r => this.Reviews.Where(rr => rr.UserId == r.UserId))
        .ToList()
        .Where(g => g.Where(gg => gg.Count() >= 2).Count() >= 3);

    var matchingReviewsByRestaurant = matches.ToDictionary(
        m => m.Key,
        m => m.Where(g => g.Count() >= 2).SelectMany(g => g));

Hope this is what you wanted!


Edit: final answer

So here is the final answer with what you wanted: pairs of reviews left by the same user.

    // Step 1 : Get the right reviews
    var matches = this.Reviews
        .GroupBy(r => r.RestaurantId, r => this.Reviews.Where(rr => rr.UserId == r.UserId))
        .ToList()
        .Where(g => g.Where(gg => gg.Count() >= 2).Count() >= 3);

    var matchingReviewsByRestaurant = matches.ToDictionary(
        m => m.Key,
        m => m.Where(g => g.Count() >= 2).SelectMany(g => g));

    // Step 2 : Create the matching couples
    var reviewsByUsers = matchingReviewsByRestaurant.SelectMany(m => m.Value).Distinct().ToLookup(r => r.UserId);
    var matchingReviewsCouples = new List<Match>();
    foreach (var reviews in reviewsByUsers)
    {
        var combinations = reviews.SelectMany(x => reviews, (x, y) => new Match(x, y))
                                  .Where(m => m.Review1.Id.CompareTo(m.Review2.Id) > 0)
                                  .ToList();
        matchingReviewsCouples.AddRange(combinations);
    }

    // Final Results are in matchingReviewsCouples

And with the data from my example, here is the result:

[image: resulting pairs of reviews]


I would suggest the following:

1) If you do not have a huge number of users and restaurants, reconsider using Guid. Make the ids int; queries on int types are faster.

2) Your input is just ints. You can introduce some data redundancy to speed up the queries. My suggested data layout is below.

    <Similarity>
      <Users>
        <User id="25">
          <Restaurants>
            <Restaurant id="1" rating="1"/>
            <Restaurant id="2" rating="2"/>
          </Restaurants>
        </User>
        <User id="26">
          <Restaurants>
            <Restaurant id="1" rating="3"/>
            <Restaurant id="2" rating="5"/>
          </Restaurants>
        </User>
      </Users>
      <Restaurants>
        <Restaurant id="1">
          <Users>
            <User id="25" rating="1"/>
            <User id="26" rating="3"/>
          </Users>
        </Restaurant>
        <Restaurant id="2">
          <Users>
            <User id="25" rating="2"/>
            <User id="26" rating="5"/>
          </Users>
        </Restaurant>
      </Restaurants>
    </Similarity>

EDIT
Given your structure, the two methods below may be useful.

    public List<Review> RestReviews(Guid rIdThatYouWantoMatch)
    {
        var reviewsOfOneRestaurant = this.Reviews.Where(r => r.RestaurantId == rIdThatYouWantoMatch).ToList();
        if (reviewsOfOneRestaurant.Count < 3)
            return null;

        // Drop reviews from users who rated fewer than 2 restaurants.
        // RemoveAll avoids modifying the list while it is being enumerated.
        reviewsOfOneRestaurant.RemoveAll(review => this.Reviews.Count(r => r.UserId == review.UserId) < 2);
        return reviewsOfOneRestaurant;
    }

    public List<Match> MatchReview(List<Review> one, List<Review> two)
    {
        List<Match> list = new List<Match>();
        foreach (var review in one)
        {
            // Find the review by the same user in the second list, if any.
            var review2 = two.Where(r => r.UserId == review.UserId).SingleOrDefault();
            if (review2 != null)
            {
                Match match = new Match();
                match.rev1 = review;
                match.rev2 = review2;
                list.Add(match); // collect the pair so it is part of the result
            }
        }
        return list;
    }

For query performance:
1) You should add a rateCount to the user table.
2) Consider restaurant-based storage: make a table for each restaurant.
You can find the suggested data structure below.

    <Similarity>
      <Users>
        <User id="1" rateCount="1"/>
        <User id="2" rateCount="2"/>
        <User id="3" rateCount="3"/>
        <User id="4" rateCount="3"/>
        <User id="5" rateCount="1"/>
        <User id="6" rateCount="1"/>
      </Users>
      <Restaurants>
        <Restaurant id="1" userCount="3">
          <User id="2" rating="1"/>
          <User id="3" rating="4"/>
          <User id="4" rating="2"/>
        </Restaurant>
        <Restaurant id="2" userCount="5">
          <User id="1" rating="2"/>
          <User id="2" rating="5"/>
          <User id="3" rating="3"/>
          <User id="4" rating="1"/>
          <User id="5" rating="5"/>
        </Restaurant>
        <Restaurant id="3" userCount="4">
          <User id="2" rating="5"/>
          <User id="3" rating="3"/>
          <User id="4" rating="2"/>
          <User id="6" rating="2"/>
        </Restaurant>
      </Restaurants>
    </Similarity>
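In memory, the restaurant-based layout could look roughly like the sketch below: one dictionary per restaurant mapping user id to rating, so the users two restaurants have in common are found with hash lookups. `RestaurantIndex`, `Build`, and `CommonUsers` are hypothetical names, int ids are assumed as suggested in point 1, and the sample data is the three-restaurant table from the question rather than the XML above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating restaurant-based storage.
public static class RestaurantIndex
{
    // Builds restaurantId -> (userId -> rating).
    // Assumes each user rated a given restaurant at most once.
    public static Dictionary<int, Dictionary<int, int>> Build(
        IEnumerable<(int UserId, int RestaurantId, int Rating)> reviews)
    {
        return reviews
            .GroupBy(r => r.RestaurantId)
            .ToDictionary(g => g.Key, g => g.ToDictionary(r => r.UserId, r => r.Rating));
    }

    // Users who rated both restaurants: one hash lookup per user of restaurantA.
    public static List<int> CommonUsers(
        Dictionary<int, Dictionary<int, int>> index, int restaurantA, int restaurantB)
    {
        if (!index.TryGetValue(restaurantA, out var a) || !index.TryGetValue(restaurantB, out var b))
            return new List<int>();

        return a.Keys.Where(u => b.ContainsKey(u)).OrderBy(u => u).ToList();
    }

    public static void Main()
    {
        // (user, restaurant, rating) entries from the question's second table.
        var index = Build(new[]
        {
            (1, 2, 2),
            (2, 1, 1), (2, 2, 5),
            (3, 1, 4), (3, 2, 3), (3, 3, 3),
            (4, 1, 2), (4, 2, 1), (4, 3, 2),
            (5, 2, 5),
            (6, 3, 2),
        });

        // Prints: 2, 3, 4
        Console.WriteLine(string.Join(", ", CommonUsers(index, 1, 2)));
    }
}
```

A pair of restaurants qualifies when `CommonUsers` returns at least 3 users; restaurants whose own dictionary has fewer than 3 entries can be skipped without any lookup.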

Source: https://habr.com/ru/post/1261980/
