Find rows that have the same value in one column and other values in another column?

I have a PostgreSQL database that stores users in the users table and the conversations they take part in in the conversation table. Since each user can participate in several conversations, and each conversation can include several users, I have a conversation_user link table to track which users are involved in each conversation:

 # conversation_user
  id | conversation_id | user_id
 ----+-----------------+---------
   1 |               1 |      32
   2 |               1 |       3
   3 |               2 |      32
   4 |               2 |       3
   5 |               2 |       4

In the table above, user 32 has one conversation with just user 3 and another with users 3 and 4. How would I write a query that shows there is a conversation between just user 32 and user 3?

I tried the following:

 SELECT conversation_id AS cid, user_id
 FROM conversation_user
 GROUP BY cid
 HAVING count(*) = 2 AND (user_id = 32 OR user_id = 3);

 SELECT conversation_id AS cid, user_id
 FROM conversation_user
 GROUP BY (cid HAVING count(*) = 2 AND (user_id = 32 OR user_id = 3));

 SELECT conversation_id AS cid, user_id
 FROM conversation_user
 WHERE (user_id = 32) OR (user_id = 3)
 GROUP BY cid
 HAVING count(*) = 2;

These queries raise an error saying that user_id must appear in the GROUP BY clause or be used in an aggregate function. Putting it in an aggregate function (for example, MIN or MAX) does not seem appropriate. I thought my first two attempts did put it in the GROUP BY.

What am I doing wrong?

4 answers

This is a case of relational division. We have assembled an arsenal of techniques for it under a related question.

A particular difficulty is to exclude additional users. There are basically 4 methods.

I suggest LEFT JOIN / IS NULL:

 SELECT cu1.conversation_id
 FROM   conversation_user cu1
 JOIN   conversation_user cu2 USING (conversation_id)
 LEFT   JOIN conversation_user cu3 ON cu3.conversation_id = cu1.conversation_id
                                  AND cu3.user_id NOT IN (3,32)
 WHERE  cu1.user_id = 32
 AND    cu2.user_id = 3
 AND    cu3.conversation_id IS NULL;

Or NOT EXISTS:

 SELECT cu1.conversation_id
 FROM   conversation_user cu1
 JOIN   conversation_user cu2 USING (conversation_id)
 WHERE  cu1.user_id = 32
 AND    cu2.user_id = 3
 AND    NOT EXISTS (
    SELECT 1
    FROM   conversation_user cu3
    WHERE  cu3.conversation_id = cu1.conversation_id
    AND    cu3.user_id NOT IN (3,32)
    );

Both queries work without a UNIQUE constraint on (conversation_id, user_id), which may or may not be in place. That is, they still work if user_id 32 (or 3) is listed more than once for the same conversation. In that case, however, you get duplicate rows in the result and must apply DISTINCT or GROUP BY.

The only condition is the one you formulated:

... a query that would show that there is a conversation between just user 32 and user 3?

Audited query

The query you linked in the comment will not work. You forgot to exclude other participants. It should be something like:

 SELECT *  -- or whatever you want to return
 FROM   conversation_user cu1
 WHERE  cu1.user_id = 32
 AND    EXISTS (
    SELECT 1
    FROM   conversation_user cu2
    WHERE  cu2.conversation_id = cu1.conversation_id
    AND    cu2.user_id = 3
    )
 AND    NOT EXISTS (
    SELECT 1
    FROM   conversation_user cu3
    WHERE  cu3.conversation_id = cu1.conversation_id
    AND    cu3.user_id NOT IN (3,32)
    );

This is similar to the other two queries, except that it does not return multiple rows if user_id = 3 is linked to the conversation more than once.
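The duplicate-row caveat can be checked on the sample rows from the question. This is a minimal sketch, not part of the original answer: it assumes an in-memory SQLite database (via Python's `sqlite3`) as a stand-in for PostgreSQL, spells the self-join with `ON` instead of `USING`, and adds one hypothetical duplicate row in which user 3 is listed twice for conversation 1.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE conversation_user ("
    "id INTEGER PRIMARY KEY, conversation_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO conversation_user (conversation_id, user_id) VALUES (?, ?)",
    [
        (1, 32), (1, 3),          # conversation 1: users 32 and 3
        (2, 32), (2, 3), (2, 4),  # conversation 2: users 32, 3 and 4
        (1, 3),                   # hypothetical duplicate: user 3 again in conversation 1
    ],
)

# NOT EXISTS variant from the answer; {distinct} is filled in below.
QUERY = """
SELECT {distinct} cu1.conversation_id
FROM   conversation_user cu1
JOIN   conversation_user cu2 ON cu2.conversation_id = cu1.conversation_id
WHERE  cu1.user_id = 32
AND    cu2.user_id = 3
AND    NOT EXISTS (
   SELECT 1
   FROM   conversation_user cu3
   WHERE  cu3.conversation_id = cu1.conversation_id
   AND    cu3.user_id NOT IN (3, 32)
   )
"""

plain_rows = conn.execute(QUERY.format(distinct="")).fetchall()
distinct_rows = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(plain_rows)     # one row per matching (cu1, cu2) pair
print(distinct_rows)  # one row per conversation
```

Without the duplicate row, both result sets contain conversation 1 exactly once, so DISTINCT only matters when (conversation_id, user_id) is not unique.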


You can use conditional aggregation to select all conversation IDs that have only those 2 specific participants.

 select conversation_id as cid
 from conversation_user
 group by conversation_id
 having count(*) = 2
    and count(case when user_id not in (32,3) then 1 end) = 0

If (conversation_id, user_id) is not unique, replace having count(*) = 2 with having count(distinct user_id) = 2.
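To see why the choice between count(*) and count(distinct user_id) matters, here is a small sketch under stated assumptions: SQLite (through Python's `sqlite3`) stands in for PostgreSQL, the grouping column is taken to be `conversation_id` as in the question's table, and conversations 1 and 3 contain hypothetical duplicate rows that are not in the original data.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversation_user (conversation_id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO conversation_user VALUES (?, ?)",
    [
        (1, 32), (1, 3), (1, 3),  # hypothetical: user 3 listed twice
        (2, 32), (2, 3), (2, 4),  # a third participant
        (3, 32), (3, 32),         # hypothetical: user 32 listed twice, nobody else
    ],
)


def matching(count_expr):
    """Return conversation ids where count_expr = 2 and no outsider takes part."""
    sql = f"""
        SELECT conversation_id
        FROM   conversation_user
        GROUP  BY conversation_id
        HAVING {count_expr} = 2
           AND count(CASE WHEN user_id NOT IN (32, 3) THEN 1 END) = 0
        ORDER  BY conversation_id
    """
    return [row[0] for row in conn.execute(sql)]


with_count_star = matching("count(*)")
with_count_distinct = matching("count(DISTINCT user_id)")

print(with_count_star)      # counts rows, so duplicates distort the result
print(with_count_distinct)  # counts participants
```

With these rows, count(*) misses conversation 1 (three rows) and wrongly accepts conversation 3 (two rows, one participant), while count(DISTINCT user_id) returns only conversation 1.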


Since you only want conversations with two users, you can use an outer self-join to the other users and filter out the hits.

To find all two-user conversations and who they are between:

 SELECT a.conversation_id cid, a.user_id user_id_1, b.user_id user_id_2
 FROM conversation_user a
 JOIN conversation_user b ON b.conversation_id = a.conversation_id
                         AND b.user_id > a.user_id
 LEFT JOIN conversation_user c ON c.conversation_id = a.conversation_id
                              AND c.user_id NOT IN (a.user_id, b.user_id)
 WHERE c.conversation_id IS NULL -- only return misses on join to others

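To find all two-user conversations for a specific user, add a filter that matches either side of the pair (because of b.user_id > a.user_id, user 32 can end up as a or as b):

 AND 32 IN (a.user_id, b.user_id)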
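The self-join can be tried on the sample rows from the question. The sketch below is an illustration rather than the answerer's own code: it assumes SQLite (via Python's `sqlite3`) in place of PostgreSQL, joins on the `conversation_id` column from the question's table, and filters for user 32 on either side of the pair, since the `b.user_id > a.user_id` condition puts the smaller id in `a`.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conversation_user (conversation_id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO conversation_user VALUES (?, ?)",
    [(1, 32), (1, 3), (2, 32), (2, 3), (2, 4)],  # sample rows from the question
)

# Pair up participants (smaller id first) and keep only pairs with no third member.
PAIRS = """
SELECT a.conversation_id AS cid, a.user_id AS user_id_1, b.user_id AS user_id_2
FROM   conversation_user a
JOIN   conversation_user b ON b.conversation_id = a.conversation_id
                          AND b.user_id > a.user_id
LEFT   JOIN conversation_user c ON c.conversation_id = a.conversation_id
                               AND c.user_id NOT IN (a.user_id, b.user_id)
WHERE  c.conversation_id IS NULL
"""

all_pairs = conn.execute(PAIRS).fetchall()
# Restrict to pairs that include user 32, whichever side it landed on.
pairs_for_32 = conn.execute(PAIRS + " AND ? IN (a.user_id, b.user_id)", (32,)).fetchall()

print(all_pairs)
print(pairs_for_32)
```

In the sample data user 32 shares conversation 1 with user 3, whose id is smaller, so user 32 shows up in the user_id_2 column.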

If you just want to confirm that such a conversation exists:

 select conversation_id
 from conversation_user
 group by conversation_id
 having bool_and(user_id in (3,32))
    and count(*) = 2;

If you want the complete rows, you can use window functions and a CTE, like this:

 with a as (
    select *,
           bool_and(user_id in (3,32)) over (partition by conversation_id)
           and count(user_id) over (partition by conversation_id) = 2 as conv_candidates
    from conversation_user
 )
 select * from a where conv_candidates;

Source: https://habr.com/ru/post/1210337/

