Excluding MySQL query results using INNER JOIN

I have two tables. The first holds books, each with a book_id. The second is a relationship table that maps book_id to keyword_id.

 SELECT b.* FROM books_table b INNER JOIN keywords_table k ON b.book_id = k.book_id AND k.keyword_id NOT IN(1,2,3) WHERE b.is_hardcover = 1 GROUP BY b.book_id 

Desired Result

None of the returned books has keyword 1, 2, or 3 attached.

Actual result

Books that have keyword 1, 2, or 3 are still returned if they also have other keyword_ids that are not on the exclusion list.

What I tried

The query above is the closest I have come, but it does not handle this case.

How can I achieve the desired result in the most optimized way?

+5
4 answers

You can do it like this:

 SELECT b.* FROM books_table b INNER JOIN keywords_table k ON b.book_id = k.book_id WHERE b.is_hardcover = 1 GROUP BY b.book_id HAVING SUM(k.keyword_id = 1) =0 AND SUM(k.keyword_id = 2) =0 AND SUM(k.keyword_id = 3) =0 
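A minimal sketch of how this query behaves, run against an in-memory SQLite database as a stand-in for MySQL (both evaluate a comparison such as k.keyword_id = 1 to 0 or 1, so SUM counts the matches). The sample rows are invented for illustration: book 1 has keywords 1 and 5, book 2 has 4 and 5, book 3 has only 2, book 4 has no keywords, and book 5 is not a hardcover.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_table (book_id INTEGER PRIMARY KEY, is_hardcover INTEGER);
CREATE TABLE keywords_table (book_id INTEGER, keyword_id INTEGER);
-- Hypothetical sample data, not from the original question.
INSERT INTO books_table VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 0);
INSERT INTO keywords_table VALUES (1, 1), (1, 5), (2, 4), (2, 5), (3, 2), (5, 4);
""")

# Each SUM counts how many of the book's keyword rows equal the given id;
# HAVING keeps only the groups where all three counts are zero.
having_ids = [row[0] for row in conn.execute("""
    SELECT b.book_id
    FROM books_table b
    INNER JOIN keywords_table k ON b.book_id = k.book_id
    WHERE b.is_hardcover = 1
    GROUP BY b.book_id
    HAVING SUM(k.keyword_id = 1) = 0
       AND SUM(k.keyword_id = 2) = 0
       AND SUM(k.keyword_id = 3) = 0
    ORDER BY b.book_id
""")]
print(having_ids)
```

Books 1 and 3 are excluded as intended. Note that book 4, which has no keywords at all, is also missing: the INNER JOIN produces no rows for it, so there is no group to test. Whether that matters depends on whether books without keywords should count as matches.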
+4

As you noted, your query returns any book that has at least one keyword other than 1, 2, or 3, which is not what you want. Instead, you want to explicitly exclude books that have any of these keywords. A join is not actually what you need here. Instead, you can use the NOT EXISTS operator:

 SELECT b.* FROM books_table b WHERE b.is_hardcover = 1 AND NOT EXISTS (SELECT * FROM keywords_table k WHERE b.book_id = k.book_id AND k.keyword_id IN (1,2,3)) 
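To make the difference concrete, here is a small sketch that runs the question's query and the NOT EXISTS version side by side. It uses an in-memory SQLite database as a stand-in for MySQL, and the sample rows are invented: book 1 has keywords 1 and 5, book 2 has 4 and 5, book 3 has only 2, book 4 has no keywords, and book 5 is not a hardcover.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_table (book_id INTEGER PRIMARY KEY, is_hardcover INTEGER);
CREATE TABLE keywords_table (book_id INTEGER, keyword_id INTEGER);
-- Hypothetical sample data, not from the original question.
INSERT INTO books_table VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 0);
INSERT INTO keywords_table VALUES (1, 1), (1, 5), (2, 4), (2, 5), (3, 2), (5, 4);
""")

# The query from the question: a book survives if ANY of its keywords
# is outside the exclusion list.
original_ids = [row[0] for row in conn.execute("""
    SELECT b.book_id
    FROM books_table b
    INNER JOIN keywords_table k
        ON b.book_id = k.book_id AND k.keyword_id NOT IN (1, 2, 3)
    WHERE b.is_hardcover = 1
    GROUP BY b.book_id
    ORDER BY b.book_id
""")]

# NOT EXISTS: a book survives only if NONE of its keywords is on the list.
not_exists_ids = [row[0] for row in conn.execute("""
    SELECT b.book_id
    FROM books_table b
    WHERE b.is_hardcover = 1
      AND NOT EXISTS (SELECT 1
                      FROM keywords_table k
                      WHERE k.book_id = b.book_id
                        AND k.keyword_id IN (1, 2, 3))
    ORDER BY b.book_id
""")]

print(original_ids, not_exists_ids)
```

With this data the original query still returns book 1 (it has keyword 1, but also keyword 5), while the NOT EXISTS query drops it. The NOT EXISTS query also returns book 4, which has no keywords at all.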
+3

You can use the following query:

 SELECT * FROM books_table WHERE is_hardcover = 1 AND book_id NOT IN (SELECT book_id FROM keywords_table GROUP BY book_id HAVING COUNT(CASE WHEN keyword_id IN (1,2,3) THEN 1 END) <> 0) 

+1

What you are asking for is a kind of "anti-join". There are several ways to achieve it; here is one:

 SELECT b.* FROM books_table b LEFT JOIN keywords_table k ON b.book_id = k.book_id AND k.keyword_id IN (1,2,3) WHERE k.book_id IS NULL AND b.is_hardcover = 1 

The left join matches each row from the left table (books_table) with those rows in the right table that satisfy the condition b.book_id = k.book_id AND k.keyword_id IN (1,2,3), and it also includes a single result row for each left-table row that matches no right-table row, with the right table's columns set to NULL. The filter condition k.book_id IS NULL contradicts the join condition, so it can be satisfied only by result rows that come from a left row with no matching right row.

Note that how the conditions are divided between the join predicate and the filter predicate is critical with an outer join such as this one. Also note that in this case there is no need for a GROUP BY, provided books_table cannot contain duplicate book_ids.
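The role of each predicate can be seen in a short sketch, run here against an in-memory SQLite database as a stand-in for MySQL. The sample rows are invented: book 1 has keywords 1 and 5, book 2 has 4 and 5, book 3 has only 2, book 4 has no keywords, and book 5 is not a hardcover. The third query is a deliberately wrong variant that moves the keyword condition from ON to WHERE.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_table (book_id INTEGER PRIMARY KEY, is_hardcover INTEGER);
CREATE TABLE keywords_table (book_id INTEGER, keyword_id INTEGER);
-- Hypothetical sample data, not from the original question.
INSERT INTO books_table VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 0);
INSERT INTO keywords_table VALUES (1, 1), (1, 5), (2, 4), (2, 5), (3, 2), (5, 4);
""")

# Step 1: the left join alone. Books with no excluded keyword get one row
# whose keyword column is NULL (None in Python).
joined_rows = conn.execute("""
    SELECT b.book_id, k.keyword_id
    FROM books_table b
    LEFT JOIN keywords_table k
        ON b.book_id = k.book_id AND k.keyword_id IN (1, 2, 3)
    WHERE b.is_hardcover = 1
    ORDER BY b.book_id
""").fetchall()

# Step 2: the anti-join keeps only those NULL-extended rows.
anti_join_ids = [row[0] for row in conn.execute("""
    SELECT b.book_id
    FROM books_table b
    LEFT JOIN keywords_table k
        ON b.book_id = k.book_id AND k.keyword_id IN (1, 2, 3)
    WHERE k.book_id IS NULL AND b.is_hardcover = 1
    ORDER BY b.book_id
""")]

# Wrong variant: with the keyword test in WHERE, no row can satisfy both
# k.keyword_id IN (...) and k.book_id IS NULL, so nothing is returned.
misplaced_ids = [row[0] for row in conn.execute("""
    SELECT b.book_id
    FROM books_table b
    LEFT JOIN keywords_table k ON b.book_id = k.book_id
    WHERE k.keyword_id IN (1, 2, 3)
      AND k.book_id IS NULL
      AND b.is_hardcover = 1
""")]

print(joined_rows, anti_join_ids, misplaced_ids)
```

Each hardcover book appears at most once in the anti-join result because a book with no matching keyword rows yields exactly one NULL-extended row, which is why no GROUP BY is needed.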

This approach is likely to perform better in practice than one based on a correlated subquery in the WHERE clause. However, if performance matters, you would do well to test the alternatives you are considering.

+1

Source: https://habr.com/ru/post/1232995/

