SQL to delete rows with the same set of values

Question

I have one table like this:

    ID  ItemsID
    1   2
    1   3
    1   4
    2   3
    2   4
    2   2

I want to remove the rows with ID=2, because in my situation the set 2,3,4 is the same as the set 3,4,2.

How to do it using SQL?

sql oracle

user1831190 Nov 17 '12 at 2:20

3 answers

AndreKR · Answer 1 · 2012-11-17T02:47:45+0000

Unfortunately, I did not notice the Oracle tag in time, so I am leaving this MySQL solution for reference. Apparently some versions of Oracle offer something similar to GROUP_CONCAT().

This may not be the most elegant solution, but it will do the job:

    DELETE FROM t
    WHERE ID IN (
        SELECT ID
        FROM (SELECT ID, GROUP_CONCAT(ItemsID ORDER BY ItemsID) AS tuple
              FROM t
              GROUP BY ID) AS tuples
        WHERE EXISTS (
            SELECT TRUE
            FROM (SELECT ID, GROUP_CONCAT(ItemsID ORDER BY ItemsID) AS tuple
                  FROM t
                  GROUP BY ID) tuples2
            WHERE tuples2.tuple = tuples.tuple
              AND tuples2.ID < tuples.ID
        )
    )

SQLfiddle

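You may need to raise group_concat_max_len, because GROUP_CONCAT() truncates its result to that length and long tuples would otherwise compare incorrectly.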
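Since the question is tagged Oracle, the same idea can probably be expressed with LISTAGG, which Oracle provides from 11g Release 2 onward. This is an untested sketch, assuming the table t(ID, ItemsID) used in the MySQL query above. The aliases d and k are illustrative only.

```sql
-- Sketch only: assumes Oracle 11gR2 or later (LISTAGG) and a table t(ID, ItemsID).
-- Each ID is reduced to one sorted, comma-separated string of its ItemsID values;
-- an ID is deleted when a smaller ID produces the same string.
DELETE FROM t
WHERE ID IN (
    WITH tuples AS (
        SELECT ID,
               LISTAGG(ItemsID, ',') WITHIN GROUP (ORDER BY ItemsID) AS tuple
        FROM t
        GROUP BY ID
    )
    SELECT d.ID
    FROM tuples d                      -- d: candidate duplicate
    WHERE EXISTS (
        SELECT 1
        FROM tuples k                  -- k: the group that is kept
        WHERE k.tuple = d.tuple
          AND k.ID < d.ID
    )
);
```

LISTAGG returns a VARCHAR2, so very large groups can exceed its length limit, much like group_concat_max_len in MySQL.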

Nick Krasnov · Answer 2 · 2012-11-17T23:11:07+0000

Another approach, for Oracle 10gR1 or higher. In this example, IDs 1, 2 and 6 share one set of ItemsID values, ID 3 has another, and IDs 4 and 5 share a third. We therefore remove IDs 2, 6 and 5 as duplicates. The ItemsID values of each ID are collected into a nested table, and the multiset except operator determines whether two groups contain the same elements:

    -- set-up
    SQL> create table tb_table(
           id      number,
           itemsid number);

    Table created

    SQL> insert into tb_table(id, itemsid)
           select 1, 2 from dual union all
           select 1, 3 from dual union all
           select 1, 4 from dual union all
           select 2, 4 from dual union all
           select 2, 3 from dual union all
           select 2, 2 from dual union all
           select 3, 2 from dual union all
           select 3, 3 from dual union all
           select 3, 6 from dual union all
           select 3, 4 from dual union all
           select 4, 1 from dual union all
           select 4, 2 from dual union all
           select 4, 3 from dual union all
           select 5, 1 from dual union all
           select 5, 2 from dual union all
           select 5, 3 from dual union all
           select 6, 2 from dual union all
           select 6, 4 from dual union all
           select 6, 3 from dual;

    19 rows inserted

    SQL> commit;

    Commit complete

    SQL> create or replace type t_numbers as table of number;
         /

    Type created

    -- contents of the table
    SQL> select *
           from tb_table;

            ID    ITEMSID
    ---------- ----------
             1          2
             1          3
             1          4
             2          4
             2          3
             2          2
             3          2
             3          3
             3          6
             3          4
             4          1
             4          2
             4          3
             5          1
             5          2
             5          3
             6          2
             6          4
             6          3

    19 rows selected

    SQL> delete from tb_table
          where id in (with DataGroups as(
                         select id
                              , grp
                              , (select count(*) from table(grp)) cnt
                           from (select id
                                      , cast(collect(itemsid) as t_numbers) grp
                                   from tb_table
                                  group by id
                                )
                       )
                       select distinct id2
                         from ( select dg1.id as id1
                                     , dg2.id as id2
                                     , (dg1.grp multiset except dg2.grp) res
                                     , dg1.cnt
                                  from DataGroups Dg1
                                 cross join DataGroups Dg2
                                 where dg1.cnt = dg2.cnt
                                 order by dg1.id
                              ) t
                        where res is empty
                          and id2 > id1
                      )
         ;

    9 rows deleted

    SQL> select *
           from tb_table;

            ID    ITEMSID
    ---------- ----------
             1          2
             1          3
             1          4
             3          2
             3          3
             3          6
             3          4
             4          1
             4          2
             4          3

    10 rows selected

a_horse_with_no_name · Answer 3 · 2012-11-17T12:15:29+0000

This is the best I can think of, but somehow I feel that there should be a simpler solution:

    delete from items
    where id in (
      select id
      from (
        with counts as (
          select id, count(*) as cnt
          from items
          group by id
        )
        select c1.id,
               row_number() over (order by c1.id) as rn
        from counts c1
          join counts c2
            on c1.id <> c2.id
           and c1.cnt = c2.cnt
           and not exists (select i1.itemsid
                           from items i1
                           where i1.id = c1.id
                           minus
                           select i2.itemsid
                           from items i2
                           where i2.id = c2.id)
      ) t
      where rn <> 1
    );

It will work for any number of itemsid values.

Combined with the ascending sort in the window definition, rn <> 1 keeps the smallest ID in the table (1 in your case). To keep the highest ID instead, change the sort order to over (order by c1.id desc).
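One caveat, as far as I can trace the query: row_number() is computed over every matching pair in the join, so only a single row of the whole result gets rn = 1. With the question's data (only IDs 1 and 2) that is fine. Once three or more IDs share a set, or there are several distinct groups of duplicates, the ID meant to be kept can appear again with rn > 1 and be deleted as well. The following untested variant is a sketch that avoids the numbering entirely by deleting an ID only when a smaller ID has the same set. It assumes the same items(id, itemsid) table and that no (id, itemsid) pair occurs twice.

```sql
-- Sketch only: assumes items(id, itemsid) with no repeated (id, itemsid) pair,
-- so that equal row counts plus an empty MINUS imply equal sets.
delete from items
where id in (
  with counts as (
    select id, count(*) as cnt
    from items
    group by id
  )
  select c1.id
  from counts c1
    join counts c2
      on c2.id < c1.id            -- only a smaller id can make c1 a duplicate
     and c1.cnt = c2.cnt
  where not exists (select i1.itemsid
                    from items i1
                    where i1.id = c1.id
                    minus
                    select i2.itemsid
                    from items i2
                    where i2.id = c2.id)
);
```

Because the smallest ID of each group never has a smaller match, it is never selected, so one representative per group survives. Swapping the comparison to c2.id > c1.id would keep the highest ID instead.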
