PostgreSQL - Array Order

I have two tables: course, which contains the id and name of each course, and tagcourse, which contains the tags for each course.

    course            tagcourse
    ------------      ----------------
    PK id_course      PK tag
    name              PK, FK id_course

I would like to write a function that searches for courses by a given array of tags and returns them ordered by the number of matching tags. However, I don't know how to write it correctly and efficiently. Please help me.


    CREATE OR REPLACE FUNCTION searchByTags(tags varchar[])
    RETURNS SETOF.....
       RETURN QUERY
       SELECT *
       FROM   course c
       INNER  JOIN tagcourse tc ON c.id_course = tc.id_course
       WHERE  ???
       ORDER  BY ???
    END....
+6
sql sql-order-by postgresql aggregate set-returning-functions
Mar 27 '13 at 16:40
2 answers
    CREATE OR REPLACE FUNCTION search_by_tags(tags varchar[])
      RETURNS TABLE (id_course integer, name text, tag_ct integer) AS
    $func$
       SELECT id_course, c.name, ct.tag_ct
       FROM  (
          SELECT tc.id_course, count(*)::int AS tag_ct
          FROM   unnest($1) x(tag)
          JOIN   tagcourse tc USING (tag)
          GROUP  BY 1                      -- first aggregate ..
          ) AS ct
       JOIN   course c USING (id_course)   -- .. then join
       ORDER  BY ct.tag_ct DESC            -- more columns to break ties?
    $func$  LANGUAGE sql;

  • Use unnest() to produce a table from your input array, as @Clodoaldo already demonstrated.

  • You don't need plpgsql for this. A plain SQL function is simpler.

  • I use unnest($1) (with a positional parameter) instead of unnest(tags), since the latter is only valid in SQL functions in PostgreSQL 9.2+ (unlike plpgsql). I quote the manual here:

In the older numeric approach, arguments are referenced using the syntax $n: $1 refers to the first input argument, $2 to the second, and so on. This will work whether or not the particular argument was declared with a name.

  • count() returns bigint. You need to cast it to int to match the declared return type, or declare the returned column as bigint to begin with.

  • A good opportunity to simplify the syntax with USING (equi-join): USING (tag) instead of ON tc.tag = x.tag.

  • It is typically faster to aggregate first and then join to the other table. That reduces the number of rows that have to be joined.
    In response to @Clodoaldo's question in the comments, here's a SQL Fiddle to demonstrate the difference.

  • OTOH, if you aggregate after the join, you don't need a subquery. Shorter, but probably slower:

    SELECT c.id_course, c.name, count(*)::int AS tag_ct
    FROM   unnest($1) x(tag)
    JOIN   tagcourse tc USING (tag)
    JOIN   course c USING (id_course)
    GROUP  BY 1
    ORDER  BY 3 DESC;   -- more columns to break ties?

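To try the "aggregate first, then join" form outside of a function, here is a minimal sketch. The column types, the temporary tables, and all sample rows are assumptions made up for illustration; the question does not specify them. The array literal stands in for the function argument $1.

```sql
-- Hypothetical schema; column types are assumed, not taken from the question.
CREATE TEMP TABLE course (
    id_course integer PRIMARY KEY,
    name      text NOT NULL
);

CREATE TEMP TABLE tagcourse (
    tag       varchar NOT NULL,
    id_course integer NOT NULL REFERENCES course,
    PRIMARY KEY (tag, id_course)
);

-- Made-up sample rows.
INSERT INTO course (id_course, name) VALUES
    (1, 'Databases'),
    (2, 'Web basics'),
    (3, 'Statistics');

INSERT INTO tagcourse (tag, id_course) VALUES
    ('sql', 1), ('postgresql', 1),
    ('sql', 2), ('html', 2),
    ('r', 3);

-- Count matching tags per course first, then join to course for the name.
SELECT c.id_course, c.name, ct.tag_ct
FROM  (
   SELECT tc.id_course, count(*)::int AS tag_ct
   FROM   unnest('{sql,postgresql}'::varchar[]) x(tag)
   JOIN   tagcourse tc USING (tag)
   GROUP  BY 1
   ) ct
JOIN   course c USING (id_course)
ORDER  BY ct.tag_ct DESC, c.name;   -- name added here as a tie-breaker
```

With these rows, course 1 matches both tags and sorts first, course 2 matches one tag, and course 3 matches none, so the inner join leaves it out.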
+4
Mar 28 '13 at 5:00
    create or replace function searchByTags(tags varchar[])
    returns table (id_course integer, name text, quantity integer) as $$
        select *
        from (
            select c.id_course, c.name, count(*)::int as quantity  -- count() returns bigint
            from course c
            inner join tagcourse tc on c.id_course = tc.id_course
            inner join unnest(tags) s(tag) on s.tag = tc.tag
            group by c.id_course, c.name
        ) s
        order by quantity desc, name
        ;
    $$ language sql;

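The result type of count(*) matters when the column has to line up with an integer column declared in RETURNS TABLE. As a quick sketch, pg_typeof() shows the type before and after a cast. The generate_series() call is only a stand-in row source and the column aliases are made up.

```sql
-- Compare the type of a raw count with the type after casting to int.
SELECT pg_typeof(count(*))      AS raw_type,   -- bigint
       pg_typeof(count(*)::int) AS cast_type   -- integer
FROM   generate_series(1, 3);
```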
0
Mar 27 '13 at 16:59


