I have a set of cities that have many-to-many relationships with a set of tags. The user gives me a set of tags (which may contain duplicates!), And I need to return a list of relevant records, sorted by relevance.
Data
Here are some sample data to illustrate the problem:
Cities:
-------------------- | id | city | -------------------- | 1 | Atlanta | | 2 | Baltimore | | 3 | Cleveland | | 4 | Denver | | 5 | Eugene | -------------------- 
Tags:
 ------ | id | ------ | 1 | | 2 | | 3 | | 4 | ------ 
Cities are marked as follows:
 Atlanta: 1, 2 Baltimore: 3 Cleveland: 1, 3, 4 Denver: 2, 3 Eugene: 1, 4 
... so the CityTags table looks like this:
 ------------------------ | city_id | tag_id | ------------------------ | 1 | 1 | | 1 | 2 | | 2 | 3 | | 3 | 1 | | 3 | 3 | | 3 | 4 | | 4 | 2 | | 4 | 3 | | 5 | 1 | | 5 | 4 | ------------------------ 
Example 1
If the user gives me identifier tags: [1, 3, 3, 4], I want to calculate how many matches I have for each of the tags, and return the result, sorted by relevance, for example:
 ------------------------ | city | matches | ------------------------ | Cleveland | 4 | | Baltimore | 2 | | Eugene | 2 | | Atlanta | 1 | | Denver | 1 | ------------------------ 
Since Cleveland matched all four tags, it was first followed by Baltimore and Eugene, each of which had two tags, etc.
Example 2
Another example for a good grade. To search [2, 2, 2, 3, 4] we get:
 ------------------------ | city | matches | ------------------------ | Denver | 4 | | Atlanta | 3 | | Cleveland | 2 | | Baltimore | 1 | | Eugene | 1 | ------------------------ 
SQL
If I ignore duplicate tags, then this is trivial:
 SELECT name,COUNT(name) AS relevance FROM (SELECT name FROM cities,citytags WHERE id=city_id AND tag_id IN (1,3,3,4)) AS matches GROUP BY name ORDER BY relevance DESC; 
But that is not what I need. I need to respect duplicates. Can anyone suggest how I can do this?
Solution in Postgresql
Yeah! I need a temporary table. Postgresql allows me to do this using the WITH syntax. Here's the solution:
 WITH search(tag) AS (VALUES (1), (3), (3), (4)) SELECT name, COUNT(name) AS relevance FROM cities INNER JOIN citytags ON cities.id=citytags.city_id INNER JOIN search ON citytags.tag_id=search.tag GROUP BY name ORDER BY relevance DESC; 
Many thanks to those who answered.