SQL Select all rows in which a subset exists

I am sure this question has been answered before, but please bear in mind that I am new to SQL and do not know how to phrase it.

I have data similar to the following (abbreviated for this example). It lives in a Postgres database.

    table1
    id  value
    1   111
    1   112
    1   113
    2   111
    2   112
    2   116
    3   111
    3   122
    3   123
    4   126
    5   123
    5   125
    6   111
    6   112
    6   116

    table2
    value
    111
    112
    116

I need to return every id from table1 for which all the values in table2 exist among that id's values in table1. For this example, the query should return 2 and 6.

Is there a way to do this in SQL? Or could you point me to a data structure that would let me get this result? I can change the table structure to whatever is needed to produce it.

Thank you very much. An answer to this question would be a lifesaver.

4 answers

Consider this demo:

    CREATE TEMP TABLE table1(id int, value int);
    INSERT INTO table1 VALUES
      (1,111),(1,112),(1,113)
     ,(2,111),(2,112),(2,116)
     ,(3,111),(3,122),(3,123)
     ,(4,126)
     ,(5,123),(5,125)
     ,(6,111),(6,112),(6,116);

    CREATE TEMP TABLE table2(value int);
    INSERT INTO table2 VALUES
      (111)
     ,(112)
     ,(116);

    SELECT t1.id
    FROM   table1 t1
    JOIN   table2 t2 USING (value)
    GROUP  BY t1.id
    HAVING count(*) = (SELECT count(*) FROM table2)
    ORDER  BY t1.id;

Result:

     id
    -----
      2
      6

Returns all ids in table1 that appear together with every value in table2, assuming each (id, value) pair occurs only once in table1.
Works for any number of rows in both tables.

If duplicate rows can appear in table1, use this HAVING clause instead:

 HAVING count(DISTINCT value) = (SELECT count(*) FROM table2) 
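A minimal sketch of why the DISTINCT variant matters. It runs both HAVING clauses through Python's built-in sqlite3 module, using an in-memory SQLite database as a stand-in for Postgres (an assumption made only to keep the example self-contained). The rows are hypothetical and chosen to contain duplicates, and the column is written as t1.value, which should be equivalent to the unqualified value here.

```python
import sqlite3

def matching_ids(conn, having):
    """Run the join/group query with the given left-hand HAVING expression."""
    sql = f"""
        SELECT t1.id
        FROM   table1 t1
        JOIN   table2 t2 USING (value)
        GROUP  BY t1.id
        HAVING {having} = (SELECT count(*) FROM table2)
        ORDER  BY t1.id
    """
    return [row[0] for row in conn.execute(sql)]

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    CREATE TABLE table1(id int, value int);
    CREATE TABLE table2(value int);
    INSERT INTO table2 VALUES (111), (112), (116);
    -- Hypothetical rows with duplicates:
    -- id 1: 111 twice, no 116 -> three joined rows, two distinct values
    -- id 2: all three values, 116 twice -> four joined rows
    INSERT INTO table1 VALUES (1,111),(1,111),(1,112),
                              (2,111),(2,112),(2,116),(2,116);
""")

plain = matching_ids(conn, "count(*)")
distinct = matching_ids(conn, "count(DISTINCT t1.value)")

print(plain)     # [1]  false positive for id 1, id 2 is missed
print(distinct)  # [2]
```

With duplicates present, count(*) counts joined rows rather than matched values, so it can both admit an id that lacks a value and reject one that has them all.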

It seems to me that what you really want is to know how to ask the right question. The magic words here are "relational division".

It is one of the operators in Codd's relational algebra, and several variations have been proposed since. More recently, Chris Date has proposed replacing the whole concept with image relations.

SQL has no explicit division operator. There are a number of workarounds using other operators, and the most suitable one depends on your requirements, including whether you need exact division or division with remainder, and how an empty divisor should be handled. Then there are the usual considerations: SQL product and version, performance, personal style and taste, etc.

Here are some articles to help you with these options:

On Making Relational Division Comprehensible

Divided We Stand: The SQL of Relational Division
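To make the distinction between division with remainder and exact division concrete, here is a rough sketch using Python's built-in sqlite3 module (an in-memory SQLite database standing in for Postgres, purely for self-containment). The data is the question's sample plus a hypothetical id 7 that has all divisor values and one extra. The two queries are one possible formulation of each variant, not the only one, and both assume the values in table2 are unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    CREATE TABLE table1(id int, value int);
    CREATE TABLE table2(value int);
    INSERT INTO table1 VALUES
        (1,111),(1,112),(1,113),
        (2,111),(2,112),(2,116),
        (3,111),(3,122),(3,123),
        (4,126),
        (5,123),(5,125),
        (6,111),(6,112),(6,116),
        (7,111),(7,112),(7,116),(7,999);  -- hypothetical id with an extra value
    INSERT INTO table2 VALUES (111), (112), (116);
""")

# Division with remainder: the id has every divisor value, extras allowed.
WITH_REMAINDER = """
    SELECT t1.id
    FROM   table1 t1
    JOIN   table2 t2 ON t2.value = t1.value
    GROUP  BY t1.id
    HAVING count(DISTINCT t2.value) = (SELECT count(*) FROM table2)
    ORDER  BY t1.id
"""

# Exact division: every divisor value is present and no row of the id
# is left unmatched by the LEFT JOIN (count(t2.value) skips NULLs).
EXACT = """
    SELECT t1.id
    FROM   table1 t1
    LEFT   JOIN table2 t2 ON t2.value = t1.value
    GROUP  BY t1.id
    HAVING count(DISTINCT t2.value) = (SELECT count(*) FROM table2)
       AND count(t2.value) = count(*)
    ORDER  BY t1.id
"""

def ids(sql):
    return [row[0] for row in conn.execute(sql)]

with_remainder = ids(WITH_REMAINDER)
exact = ids(EXACT)

# Empty divisor: the inner join leaves no rows to group, so no id is returned.
conn.execute("DELETE FROM table2")
empty_divisor = ids(WITH_REMAINDER)

print(with_remainder)  # [2, 6, 7]
print(exact)           # [2, 6]
print(empty_divisor)   # []
```

The empty-divisor result is a property of the join-based formulation. A nested NOT EXISTS formulation returns every id in that case, which is one reason the choice of workaround depends on the requirements.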


UPDATE Another possibility:

    SELECT t1.id
    FROM  (
        SELECT t1.id, t1.value
        FROM   table1 t1
        JOIN   table2 t2 USING (value)
        GROUP  BY t1.id, t1.value
        ORDER  BY t1.id
        ) t1
    GROUP  BY t1.id
    HAVING COUNT(*) = (SELECT COUNT(*) FROM table2)

The cost of this query, as reported by EXPLAIN ANALYZE, is consistently 893-900, even with duplicate rows.


NOT EXISTS (... NOT EXISTS) is the standard solution for relational division:

    SELECT DISTINCT id
    FROM   table1 t1
    WHERE  NOT EXISTS (
        SELECT *
        FROM   table2 t2
        WHERE  NOT EXISTS (
            SELECT *
            FROM   table1 t1x
            WHERE  t1x.value = t2.value
            AND    t1x.id = t1.id
            )
        )
    ;

In this case, DISTINCT is required because we have no access to the domain table of ids, only to the junction table table1 (aliased t1) that references it.
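A short sketch of that point, again using Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for Postgres. The ids table is hypothetical: it plays the role of the domain table that the original schema does not have, and the data is a shortened version of the question's sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Postgres (assumption)
conn.executescript("""
    CREATE TABLE table1(id int, value int);
    CREATE TABLE table2(value int);
    CREATE TABLE ids(id int PRIMARY KEY);  -- hypothetical domain table
    INSERT INTO table1 VALUES
        (1,111),(1,112),(1,113),
        (2,111),(2,112),(2,116),
        (6,111),(6,112),(6,116);
    INSERT INTO table2 VALUES (111), (112), (116);
    INSERT INTO ids SELECT DISTINCT id FROM table1;
""")

# The double NOT EXISTS query, parameterised on the select list and on the
# outer table that supplies the candidate ids.
DIVISION = """
    SELECT {select}
    FROM   {source} o
    WHERE  NOT EXISTS (
        SELECT * FROM table2 t2
        WHERE  NOT EXISTS (
            SELECT * FROM table1 t1x
            WHERE  t1x.value = t2.value
            AND    t1x.id = o.id
        )
    )
    ORDER  BY o.id
"""

def run(select, source):
    sql = DIVISION.format(select=select, source=source)
    return [row[0] for row in conn.execute(sql)]

no_distinct = run("o.id", "table1")             # one output row per table1 row
with_distinct = run("DISTINCT o.id", "table1")  # the query from the answer
from_domain = run("o.id", "ids")                # domain table: one row per id

print(no_distinct)    # [2, 2, 2, 6, 6, 6]
print(with_distinct)  # [2, 6]
print(from_domain)    # [2, 6]
```

When the outer query ranges over a table with one row per id, each qualifying id is produced once, so DISTINCT is no longer needed.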


Source: https://habr.com/ru/post/1385265/

