Ok, I have a relationship that stores two keys, a product identifier and an attribute identifier. I want to find out which product is most similar to this product. (Attributes are actually numbers, but this makes the example more confusing, so they were changed to letters to simplify the visual presentation.)
Prod_att
Product | Attributes 1 | A 1 | B 1 | C 2 | A 2 | B 2 | D 3 | A 3 | E 4 | A
This initially seems pretty straightforward, just select the attributes that the product has, and then count the number of attributes per total product. The result of this is then compared with the number of attributes that the product has, and I see how the two products are similar. This works for products with more attributes than their compared products, but problems arise when products have very few attributes. For example, product 3 will have a tie for almost any other product (since A is very common).
SELECT Product, count(Attributes) FROM Prod_att WHERE Attributes IN (SELECT Attributes FROM prod_att WHERE Product = 1) GROUP BY Product ;
Any suggestions to fix this or improve my current request?
Thanks!
* edit: Product 4 will return count () = 1 for all products. I would like to show that product 3 is more similar as it has fewer differences.
source share