Filter an SQL query by a unique set of column values, regardless of their order

I have an Oracle table with two columns, and I would like to query it for rows containing a unique combination of values, regardless of the order of those values. For example, given the following table:

 create table RELATIONSHIPS (
   PERSON_1 number not null,
   PERSON_2 number not null,
   RELATIONSHIP number not null,
   constraint PK_RELATIONSHIPS primary key (PERSON_1, PERSON_2)
 );

I would like to select all the unique relationships. So if there is a row with PERSON_1 = John and PERSON_2 = Jill, I do not want to see another row where PERSON_1 = Jill and PERSON_2 = John.

Is there an easy way to do this?

+4
10 answers

It is not clear whether you want to prevent duplicates from being inserted into the database. You may just want to fetch unique pairs while leaving the duplicates in place.

So here is an alternative solution for the latter case, querying unique pairs even if duplicates exist:

 SELECT r1.*
 FROM Relationships r1
 LEFT OUTER JOIN Relationships r2
   ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
 WHERE r1.person_1 < r1.person_2
    OR r2.person_1 IS NULL;

So if there is a matching row with the IDs reversed, there is a rule for which row the query should prefer (the one with the IDs in numerical order).

If there is no matching row, then r2 will be NULL (this is how an outer join works), so just use whatever is found in r1 in that case.

There is no need to use GROUP BY or DISTINCT, because the primary key guarantees there can be only zero or one matching row.
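As a quick illustration of that rule, here is a minimal sketch with made-up rows (the IDs 1, 2 and 3 and the relationship value 0 are hypothetical, not from the question). The pair {1, 2} is stored in both directions, while {3, 1} is stored only once:

```sql
-- Hypothetical sample data for illustration only.
INSERT INTO relationships (person_1, person_2, relationship) VALUES (1, 2, 0);
INSERT INTO relationships (person_1, person_2, relationship) VALUES (2, 1, 0);
INSERT INTO relationships (person_1, person_2, relationship) VALUES (3, 1, 0);

-- Same query as above, run against the sample rows.
SELECT r1.*
FROM relationships r1
LEFT OUTER JOIN relationships r2
  ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
WHERE r1.person_1 < r1.person_2   -- keeps (1,2): IDs already in numerical order
   OR r2.person_1 IS NULL;        -- keeps (3,1): no reversed row (1,3) exists
```

The (2,1) row is dropped because its reversed partner exists and 2 < 1 is false, while (3,1) survives only because nothing matches it in r2.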

Trying this in MySQL, I get the following optimization plan:

 +----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
 | id | select_type | table | type   | possible_keys | key     | key_len | ref                               | rows | Extra                    |
 +----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
 |  1 | SIMPLE      | r1    | ALL    | NULL          | NULL    | NULL    | NULL                              |    2 |                          |
 |  1 | SIMPLE      | r2    | eq_ref | PRIMARY       | PRIMARY | 8       | test.r1.person_2,test.r1.person_1 |    1 | Using where; Using index |
 +----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+

This appears to be a reasonably good use of indexes.

+1

Is the relationship always stored in both directions? That is, if John and Jill are related, is there always both {John, Jill} and {Jill, John}? If so, just restrict the query to rows where Person_1 < Person_2 and take the distinct set.
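If that assumption holds, the filter could be as small as the following sketch. It assumes every pair really is stored in both directions; a pair stored only once, with the larger ID first, would be silently dropped.

```sql
-- Assumes each relationship is stored twice, once per direction.
-- Keeps only the direction where the smaller ID comes first.
SELECT person_1, person_2, relationship
FROM relationships
WHERE person_1 < person_2;
```

Because (PERSON_1, PERSON_2) is the primary key, the rows that remain are already distinct, so no DISTINCT is needed here.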

+4
 select distinct
        case when PERSON_1 >= PERSON_2 then PERSON_1 else PERSON_2 end person_a,
        case when PERSON_1 >= PERSON_2 then PERSON_2 else PERSON_1 end person_b
   from RELATIONSHIPS;
+3

Untested:

 select least(person_1,person_2)
      , greatest(person_1,person_2)
   from relationships
  group by least(person_1,person_2)
      , greatest(person_1,person_2)

To prevent such duplicate entries, you can add a unique index using the same idea (tested!):

 SQL> create table relationships
   2  ( person_1 number not null
   3  , person_2 number not null
   4  , relationship number not null
   5  , constraint pk_relationships primary key (person_1, person_2)
   6  )
   7  /

 Table created.

 SQL> create unique index ui_relationships on relationships(least(person_1,person_2),greatest(person_1,person_2))
   2  /

 Index created.

 SQL> insert into relationships values (1,2,0)
   2  /

 1 row created.

 SQL> insert into relationships values (1,3,0)
   2  /

 1 row created.

 SQL> insert into relationships values (2,1,0)
   2  /
 insert into relationships values (2,1,0)
 *
 ERROR at line 1:
 ORA-00001: unique constraint (RWIJK.UI_RELATIONSHIPS) violated

Regards, Rob.

+3

You should create a constraint on the Relationships table so that the numerical value of person_1 is always less than the numerical value of person_2.

 create table RELATIONSHIPS (
   PERSON_1 number not null,
   PERSON_2 number not null,
   RELATIONSHIP number not null,
   constraint PK_RELATIONSHIPS primary key (PERSON_1, PERSON_2),
   constraint UNIQ_RELATIONSHIPS CHECK (PERSON_1 < PERSON_2)
 );

This way you can be sure that (2,1) can never be inserted; it has to be stored as (1,2). The PRIMARY KEY constraint will then prevent duplicates.
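With that constraint in place, the smaller ID has to come first on every insert. One way to arrange that, sketched here with hypothetical bind variables :a, :b and :rel (they are not part of the original answer), is to let LEAST and GREATEST order the pair:

```sql
-- :a, :b and :rel are hypothetical bind variables supplied by the caller.
-- LEAST/GREATEST put the smaller ID in PERSON_1, so the CHECK constraint
-- is satisfied no matter which order the caller passes the two people in.
INSERT INTO relationships (person_1, person_2, relationship)
VALUES (LEAST(:a, :b), GREATEST(:a, :b), :rel);
```

If :a and :b are equal, the CHECK constraint still rejects the row, because PERSON_1 < PERSON_2 cannot hold for a person paired with themselves.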

PS: I see that Marc Gravell answered faster than me with a similar solution.

+2

I think something like this should do the trick:

 select * from RELATIONSHIPS group by PERSON_1, PERSON_2 
0

I think KM has it almost right; I just added concat.

 SELECT DISTINCT *
 FROM (SELECT DISTINCT concat(Person_1, Person_2) FROM RELATIONSHIPS
       UNION
       SELECT DISTINCT concat(Person_2, Person_1) FROM RELATIONSHIPS
      ) dt
0

It is kludgy as hell, but it will at least tell you which unique combinations you have, just not in a very convenient form...

 select distinct (case when person_1 <= person_2
                       then person_1||'|'||person_2
                       else person_2||'|'||person_1
                  end)
   from relationships;
0

Perhaps the simplest solution (one that does not require changing the data structure or creating triggers) is to build a result set of the rows that have no reversed duplicate, and then add one row from each duplicated pair to that set.

It would look something like this:

 select *
   from relationships
  where rowid not in (select a.rowid
                        from relationships a, relationships b
                       where a.person_1 = b.person_2
                         and a.person_2 = b.person_1)
 union all
 select *
   from relationships
  where rowid in (select a.rowid
                    from relationships a, relationships b
                   where a.person_1 = b.person_2
                     and a.person_2 = b.person_1
                     and a.person_1 > a.person_2)

But as a rule, I never create a table without a single-column primary key.

0

You could just

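  with rel as (
 -- Oracle needs an alias before * when other columns are selected,
 -- and row_number() requires an ORDER BY in its window.
 select r.*,
        row_number () over (partition by least (person_1, person_2),
                                        greatest (person_1, person_2)
                                order by person_1) as rn
   from relationships r
        )
 select *
   from rel
  where rn = 1;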
0

Source: https://habr.com/ru/post/1286033/

