Filter an SQL query by a unique set of column values, regardless of their order

Question

I have a table in Oracle with two columns, and I would like to query it for records containing a unique combination of values in those columns, regardless of the order of the values. For example, if I have the following table:

create table RELATIONSHIPS (
    PERSON_1 number not null,
    PERSON_2 number not null,
    RELATIONSHIP number not null,
    constraint PK_RELATIONSHIPS primary key (PERSON_1, PERSON_2)
);

I would like to query all the unique relationships. So if I have a record with PERSON_1 = John and PERSON_2 = Jill, I do not want to see another record where PERSON_1 = Jill and PERSON_2 = John.

Is there an easy way to do this?

+4

sql oracle

Kevin Babcock Jun 10 '09 at 19:24

10 answers

Is the relationship always stored in both directions? I.e., if John and Jill are related to each other, are there always both {John, Jill} and {Jill, John}? If so, just restrict the query to rows where PERSON_1 < PERSON_2 and take the distinct set.
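A minimal sketch of that filter, run through Python's built-in sqlite3 module rather than Oracle so it can be tried anywhere. The sample rows are made up for illustration and assume every relationship really is stored in both directions:

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    " person_1 INTEGER NOT NULL,"
    " person_2 INTEGER NOT NULL,"
    " relationship INTEGER NOT NULL,"
    " PRIMARY KEY (person_1, person_2))"
)
# Hypothetical data: each pair is present in both directions.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [(1, 2, 0), (2, 1, 0), (1, 3, 0), (3, 1, 0)],
)

# Keep only the row of each mirrored pair whose ids are in ascending order.
unique_pairs = conn.execute(
    "SELECT DISTINCT person_1, person_2"
    " FROM relationships"
    " WHERE person_1 < person_2"
    " ORDER BY person_1, person_2"
).fetchall()
print(unique_pairs)  # prints [(1, 2), (1, 3)]
```

If some pairs exist in one direction only, with the larger id first, this filter silently drops them, so it is only safe under the both-directions assumption.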

+4

Marc Gravell Jun 10 '09 at 19:34

 select distinct
        case when PERSON_1 >= PERSON_2 then PERSON_1 else PERSON_2 end person_a,
        case when PERSON_1 >= PERSON_2 then PERSON_2 else PERSON_1 end person_b
   from RELATIONSHIPS;

+3

tekBlues Jun 10 '09 at 19:32

Untested:

 select least(person_1, person_2)
      , greatest(person_1, person_2)
   from relationships
  group by least(person_1, person_2)
      , greatest(person_1, person_2)
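For anyone who wants to try the idea without an Oracle instance, here is a rough sketch using Python's built-in sqlite3 module. SQLite has no LEAST/GREATEST, so the two-argument scalar forms of min() and max() stand in for them; the sample rows are invented for illustration:

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    " person_1 INTEGER NOT NULL,"
    " person_2 INTEGER NOT NULL,"
    " relationship INTEGER NOT NULL,"
    " PRIMARY KEY (person_1, person_2))"
)
# Hypothetical data: one mirrored pair, one reversed-only pair, one plain pair.
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [(1, 2, 0), (2, 1, 0), (3, 1, 0), (4, 5, 0)],
)

# min(a, b) / max(a, b) play the role of Oracle's least / greatest:
# every pair is normalized to (smaller id, larger id) before grouping.
normalized_pairs = conn.execute(
    "SELECT min(person_1, person_2) AS person_a,"
    "       max(person_1, person_2) AS person_b"
    " FROM relationships"
    " GROUP BY min(person_1, person_2), max(person_1, person_2)"
    " ORDER BY person_a, person_b"
).fetchall()
print(normalized_pairs)  # prints [(1, 2), (1, 3), (4, 5)]
```

Note that the pair stored only as (3, 1) comes back as (1, 3): the query reports normalized pairs, not the original rows.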

To prevent such duplicate entries, you can add a unique index using the same idea (tested!):

 SQL> create table relationships
   2  ( person_1 number not null
   3  , person_2 number not null
   4  , relationship number not null
   5  , constraint pk_relationships primary key (person_1, person_2)
   6  )
   7  /

 Table created.

 SQL> create unique index ui_relationships on relationships(least(person_1,person_2),greatest(person_1,person_2))
   2  /

 Index created.

 SQL> insert into relationships values (1,2,0)
   2  /

 1 row created.

 SQL> insert into relationships values (1,3,0)
   2  /

 1 row created.

 SQL> insert into relationships values (2,1,0)
   2  /
 insert into relationships values (2,1,0)
 *
 ERROR at line 1:
 ORA-00001: unique constraint (RWIJK.UI_RELATIONSHIPS) violated
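The same trick can be reproduced outside Oracle. The sketch below is only an approximation of the session above: it uses Python's built-in sqlite3 module, assumes a SQLite build with indexes on expressions (3.9 or later), and substitutes the scalar min()/max() functions for LEAST/GREATEST:

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    " person_1 INTEGER NOT NULL,"
    " person_2 INTEGER NOT NULL,"
    " relationship INTEGER NOT NULL,"
    " PRIMARY KEY (person_1, person_2))"
)
# Unique index on the order-independent form of the pair.
conn.execute(
    "CREATE UNIQUE INDEX ui_relationships"
    " ON relationships (min(person_1, person_2), max(person_1, person_2))"
)

conn.execute("INSERT INTO relationships VALUES (1, 2, 0)")
conn.execute("INSERT INTO relationships VALUES (1, 3, 0)")

# (2, 1) normalizes to the same key as (1, 2), so the index rejects it.
try:
    conn.execute("INSERT INTO relationships VALUES (2, 1, 0)")
    mirrored_rejected = False
except sqlite3.IntegrityError:
    mirrored_rejected = True

row_count = conn.execute("SELECT count(*) FROM relationships").fetchone()[0]
print(mirrored_rejected, row_count)  # prints True 2
```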

Regards, Rob.

+3

Rob van Wijk Jun 10 '09 at 19:42

You should create a constraint on the Relationships table so that the numerical value of person_1 is always less than the numerical value of person_2.

 create table RELATIONSHIPS (
     PERSON_1 number not null,
     PERSON_2 number not null,
     RELATIONSHIP number not null,
     constraint PK_RELATIONSHIPS primary key (PERSON_1, PERSON_2),
     constraint UNIQ_RELATIONSHIPS CHECK (PERSON_1 < PERSON_2)
 );

This way you can be sure that (2,1) can never be inserted; it must be entered as (1,2). The PRIMARY KEY constraint will then prevent duplicates.
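To see the two constraints working together, here is a small sketch using Python's built-in sqlite3 module in place of Oracle. The try_insert helper and the sample rows are hypothetical, made up for the demonstration:

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    " person_1 INTEGER NOT NULL,"
    " person_2 INTEGER NOT NULL,"
    " relationship INTEGER NOT NULL,"
    " CONSTRAINT pk_relationships PRIMARY KEY (person_1, person_2),"
    " CONSTRAINT uniq_relationships CHECK (person_1 < person_2))"
)


def try_insert(row):
    """Hypothetical helper: True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO relationships VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((1, 2, 0)),  # ids in ascending order: accepted
    try_insert((2, 1, 0)),  # reversed pair: rejected by the CHECK constraint
    try_insert((1, 2, 7)),  # same pair again: rejected by the PRIMARY KEY
]
print(results)  # prints [True, False, False]
```

The trade-off is that the application has to order the two ids before every insert, whereas the unique-index approach accepts either order for the first row of a pair.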

PS: I see that Marc Gravell answered faster than me with a similar solution.

+2

Bill Karwin Jun 10 '09 at 19:37

I think something like this should do the trick:

 select * from RELATIONSHIPS group by PERSON_1, PERSON_2

0

Aistina Jun 10 '09 at 19:33

I think KM is almost right; I added concat.

 SELECT DISTINCT *
 FROM (SELECT DISTINCT concat(Person_1, Person_2) FROM RELATIONSHIPS
       UNION
       SELECT DISTINCT concat(Person_2, Person_1) FROM RELATIONSHIPS
      ) dt

0

Mike Nereson Jun 10 '09 at 19:41

It's kludgy as hell, but it will at least tell you which unique combinations you have. It's just not very convenient...

 select distinct (case when person_1 <= person_2
                       then person_1 || '|' || person_2
                       else person_2 || '|' || person_1
                  end)
   from relationships;

0

copaX Jun 10 '09 at 19:42

Perhaps the simplest solution (one that does not require changing the data structure or creating triggers) is to build a result set of the records that have no mirrored duplicate, and then add one record from each mirrored pair to that set.

It would look something like this:

 select *
   from relationships
  where rowid not in (select a.rowid
                        from relationships a, relationships b
                       where a.person_1 = b.person_2
                         and a.person_2 = b.person_1)
 union all
 select *
   from relationships
  where rowid in (select a.rowid
                    from relationships a, relationships b
                   where a.person_1 = b.person_2
                     and a.person_2 = b.person_1
                     and a.person_1 > a.person_2)

That said, I would normally never create a table without a single-column primary key.

0

Flatline75 Jun 11 '09 at 12:00

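You could just use ROW_NUMBER:

 -- Oracle needs a table alias in front of * when other columns follow it,
 -- and ROW_NUMBER requires an ORDER BY in its analytic clause.
 with rel as (
   select r.*,
          row_number() over (partition by least(r.person_1, r.person_2),
                                          greatest(r.person_1, r.person_2)
                             order by r.person_1) as rn
     from relationships r
 )
 select *
   from rel
  where rn = 1;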

0

Scott Swank Jun 12 '09 at 0:01

Bill Karwin · Accepted Answer · 2009-06-10T20:30:36+0000

There is some uncertainty as to whether you want to prevent duplicates from being inserted into the database. You may just want to fetch unique pairs while keeping the duplicates in the table.

So here is an alternative solution for the latter case, querying unique pairs even if duplicates exist:

 SELECT r1.*
 FROM Relationships r1
 LEFT OUTER JOIN Relationships r2
   ON (r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1)
 WHERE r1.person_1 < r1.person_2
    OR r2.person_1 IS NULL;

So if there is a matching row with the ids reversed, there is a rule for which of the two rows the query should prefer (the one whose ids are in numerical order).

If there is no matching row, then the r2 columns will be NULL (this is how an outer join works), so the query just uses whatever is found in r1 in that case.

No need to use GROUP BY or DISTINCT, because there can only be zero or one matching row.
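As a quick sanity check of the reasoning above, here is a sketch that runs the same self-join through Python's built-in sqlite3 module instead of Oracle or MySQL. The sample rows are invented: one mirrored pair, one pair stored only in reverse order, and one ordinary pair:

```python
import sqlite3

# In-memory SQLite database standing in for the real table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationships ("
    " person_1 INTEGER NOT NULL,"
    " person_2 INTEGER NOT NULL,"
    " relationship INTEGER NOT NULL,"
    " PRIMARY KEY (person_1, person_2))"
)
conn.executemany(
    "INSERT INTO relationships VALUES (?, ?, ?)",
    [(1, 2, 0), (2, 1, 0), (3, 1, 0), (4, 5, 0)],
)

# r2 is the mirrored row, if any. A row survives when its ids are already in
# ascending order, or when no mirrored row exists at all.
kept_rows = conn.execute(
    "SELECT r1.person_1, r1.person_2"
    " FROM relationships r1"
    " LEFT OUTER JOIN relationships r2"
    "   ON r1.person_1 = r2.person_2 AND r1.person_2 = r2.person_1"
    " WHERE r1.person_1 < r1.person_2"
    "    OR r2.person_1 IS NULL"
    " ORDER BY r1.person_1, r1.person_2"
).fetchall()
print(kept_rows)  # prints [(1, 2), (3, 1), (4, 5)]
```

The row (2, 1) is dropped in favor of (1, 2), while (3, 1) is kept as stored because it has no mirrored counterpart.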

Trying this in MySQL, I get the following optimization plan:

 +----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
 | id | select_type | table | type   | possible_keys | key     | key_len | ref                               | rows | Extra                    |
 +----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+
 |  1 | SIMPLE      | r1    | ALL    | NULL          | NULL    | NULL    | NULL                              |    2 |                          |
 |  1 | SIMPLE      | r2    | eq_ref | PRIMARY       | PRIMARY | 8       | test.r1.person_2,test.r1.person_1 |    1 | Using where; Using index |
 +----+-------------+-------+--------+---------------+---------+---------+-----------------------------------+------+--------------------------+

This is apparently a pretty good use of indexes.
