SQL to delete rows with the same set of values

I have one table like this:

ID  ItemsID
1   2
1   3
1   4
2   3
2   4
2   2

I want to remove the rows with ID=2, because in my situation the set 2,3,4 is the same as 3,4,2.

How can I do this in SQL?

3 answers

Unfortunately, I did not notice the Oracle tag in time, but I am leaving this MySQL solution for reference. Apparently, some versions of Oracle offer something similar to GROUP_CONCAT().


This may not be the most elegant solution, but it will do the job:

    DELETE FROM t
    WHERE ID IN (
      SELECT ID
      FROM (SELECT ID, GROUP_CONCAT(ItemsID ORDER BY ItemsID) AS tuple
            FROM t
            GROUP BY ID) AS tuples
      WHERE EXISTS (
        SELECT TRUE
        FROM (SELECT ID, GROUP_CONCAT(ItemsID ORDER BY ItemsID) AS tuple
              FROM t
              GROUP BY ID) tuples2
        WHERE tuples2.tuple = tuples.tuple
          AND tuples2.ID < tuples.ID
      )
    )


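You may need to raise group_concat_max_len: GROUP_CONCAT() truncates its result to that length (1024 by default), and truncated strings can compare equal for different sets.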
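The sketch below mimics the sorted, comma-joined signature in plain Python (not MySQL) to show what a length limit does to the comparison. The function names, the max_len parameter, and the second data set are hypothetical and chosen only for illustration; MySQL counts the limit in bytes, which equals characters for these ASCII digits.

```python
from collections import defaultdict


def signatures(rows, max_len=None):
    """Map each ID to a comma-joined string of its sorted ItemsID values.

    max_len (hypothetical) cuts the string the way a length limit would.
    """
    groups = defaultdict(list)
    for id_, item in rows:
        groups[id_].append(item)
    sigs = {}
    for id_, items in groups.items():
        sig = ",".join(str(i) for i in sorted(items))
        sigs[id_] = sig if max_len is None else sig[:max_len]
    return sigs


def duplicate_ids(sigs):
    """Return IDs whose signature already appeared under a smaller ID."""
    seen = set()
    dupes = []
    for id_ in sorted(sigs):
        if sigs[id_] in seen:
            dupes.append(id_)
        else:
            seen.add(sigs[id_])
    return dupes


# Data from the question: ID 2 holds the same set as ID 1.
rows = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (2, 2)]
print(duplicate_ids(signatures(rows)))

# Hypothetical data: two different sets whose signatures share a prefix.
long_rows = [(1, 100), (1, 200), (1, 300), (2, 100), (2, 200), (2, 999)]
print(duplicate_ids(signatures(long_rows)))             # full signatures differ
print(duplicate_ids(signatures(long_rows, max_len=7)))  # truncated ones collide
```

With the truncated signatures, ID 2 is reported as a duplicate even though its set differs, which is the kind of false match a too-small limit can produce.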


Another approach, for Oracle 10gR1 or higher. In this example, IDs 1, 2 and 6 share one set of ItemsID values, ID 3 has another, and IDs 4 and 5 share a third. We are therefore going to remove IDs 2, 6 and 5 as duplicates. The ItemsID values of each ID are collected into a nested table, and the multiset except operator is used to determine whether two groups contain the same elements:

    -- set-up
    SQL> create table tb_table(
           id      number,
           itemsid number);

    Table created

    SQL> insert into tb_table(id, itemsid)
           select 1, 2 from dual union all
           select 1, 3 from dual union all
           select 1, 4 from dual union all
           select 2, 4 from dual union all
           select 2, 3 from dual union all
           select 2, 2 from dual union all
           select 3, 2 from dual union all
           select 3, 3 from dual union all
           select 3, 6 from dual union all
           select 3, 4 from dual union all
           select 4, 1 from dual union all
           select 4, 2 from dual union all
           select 4, 3 from dual union all
           select 5, 1 from dual union all
           select 5, 2 from dual union all
           select 5, 3 from dual union all
           select 6, 2 from dual union all
           select 6, 4 from dual union all
           select 6, 3 from dual;

    19 rows inserted

    SQL> commit;

    Commit complete

    SQL> create or replace type t_numbers as table of number;
         /

    Type created

    -- contents of the table
    SQL> select *
           from tb_table;

            ID    ITEMSID
    ---------- ----------
             1          2
             1          3
             1          4
             2          4
             2          3
             2          2
             3          2
             3          3
             3          6
             3          4
             4          1
             4          2
             4          3
             5          1
             5          2
             5          3
             6          2
             6          4
             6          3

    19 rows selected

    SQL> delete from tb_table
          where id in (with DataGroups as(
                         select id
                              , grp
                              , (select count(*) from table(grp)) cnt
                           from (select id
                                      , cast(collect(itemsid) as t_numbers) grp
                                   from tb_table
                                  group by id
                                )
                       )
                       select distinct id2
                         from ( select dg1.id as id1
                                     , dg2.id as id2
                                     , (dg1.grp multiset except dg2.grp) res
                                     , dg1.cnt
                                  from DataGroups Dg1
                                 cross join DataGroups Dg2
                                 where dg1.cnt = dg2.cnt
                                 order by dg1.id
                              ) t
                        where res is empty
                          and id2 > id1
                      )
         ;

    9 rows deleted

    SQL> select *
           from tb_table;

            ID    ITEMSID
    ---------- ----------
             1          2
             1          3
             1          4
             3          2
             3          3
             3          6
             3          4
             4          1
             4          2
             4          3

    10 rows selected
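For readers without an Oracle instance at hand, the comparison logic of that delete can be sketched in plain Python. This is only an analogy, not Oracle code: collections.Counter stands in for the nested table, Counter subtraction for multiset except, and the helper name same_multiset is hypothetical. The rows are the ones inserted in the set-up above.

```python
from collections import Counter, defaultdict

# Same rows as in the tb_table set-up above.
rows = [(1, 2), (1, 3), (1, 4), (2, 4), (2, 3), (2, 2),
        (3, 2), (3, 3), (3, 6), (3, 4), (4, 1), (4, 2), (4, 3),
        (5, 1), (5, 2), (5, 3), (6, 2), (6, 4), (6, 3)]

# One multiset of ItemsID values per ID (the counterpart of collect ... group by id).
groups = defaultdict(Counter)
for id_, item in rows:
    groups[id_][item] += 1


def same_multiset(a, b):
    """Equal element counts and an empty difference a - b mean a and b match."""
    return sum(a.values()) == sum(b.values()) and not (a - b)


# Every ID that matches a group with a smaller ID is a duplicate to delete.
to_delete = sorted({
    id2
    for id1 in groups
    for id2 in groups
    if id2 > id1 and same_multiset(groups[id1], groups[id2])
})
print(to_delete)

remaining = [(i, v) for i, v in rows if i not in to_delete]
print(len(rows) - len(remaining), "rows would be deleted")
```

Checking the counts first is what makes the one-directional difference sufficient: if both groups have the same number of elements and nothing is left in the first after removing the second, the two must be equal.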

This is the best I can think of, but somehow I feel that there should be a simpler solution:

    delete from items
    where id in (
      select id
      from (
        with counts as (
          select id, count(*) as cnt
          from items
          group by id
        )
        select distinct c1.id
        from counts c1
          join counts c2
            on c1.id > c2.id
           and c1.cnt = c2.cnt
           and not exists (select i1.itemsid from items i1 where i1.id = c1.id
                           minus
                           select i2.itemsid from items i2 where i2.id = c2.id)
      ) t
    );

It will work for any number of itemsid values.

The join condition c1.id > c2.id selects every ID whose set of values also exists under a smaller ID, so the smallest ID of each group of duplicates is kept (in your case 1). If you want to keep the highest ID instead, change the condition to c1.id < c2.id.
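As a quick way to try the pairwise set-difference idea without Oracle, here is a sketch that runs a similar delete against an in-memory SQLite database from Python. The assumptions are mine: SQLite stands in for Oracle, EXCEPT replaces MINUS, the sample rows are borrowed from the nested-table answer above, and the c1.id > c2.id filter is this sketch's way of keeping the smallest ID in each group of duplicates.

```python
import sqlite3

# Sample rows borrowed from the nested-table answer above.
rows = [(1, 2), (1, 3), (1, 4), (2, 4), (2, 3), (2, 2),
        (3, 2), (3, 3), (3, 6), (3, 4), (4, 1), (4, 2), (4, 3),
        (5, 1), (5, 2), (5, 3), (6, 2), (6, 4), (6, 3)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, itemsid INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", rows)

# Delete every ID that has the same row count as a smaller ID and no
# itemsid value missing from that smaller ID's set.
conn.execute("""
    WITH counts AS (
        SELECT id, COUNT(*) AS cnt FROM items GROUP BY id
    )
    DELETE FROM items
    WHERE id IN (
        SELECT c1.id
        FROM counts c1
        JOIN counts c2 ON c1.id > c2.id AND c1.cnt = c2.cnt
        WHERE NOT EXISTS (
            SELECT i1.itemsid FROM items i1 WHERE i1.id = c1.id
            EXCEPT
            SELECT i2.itemsid FROM items i2 WHERE i2.id = c2.id
        )
    )
""")

remaining_ids = [r[0] for r in conn.execute(
    "SELECT DISTINCT id FROM items ORDER BY id")]
remaining_rows = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(remaining_ids, remaining_rows)
```

The count check matters here for the same reason as in the Oracle versions: the set difference is tested in one direction only, so equal sizes are what rule out a proper subset. This relies on each (id, itemsid) pair appearing at most once.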


Source: https://habr.com/ru/post/1446590/

