Delete Duplicate Rows # 2

I have a large table (~1,000,000 rows) that may contain duplicate rows (possibly with NULL values).

I want to do the following:

  • Select only distinct rows.
  • Delete rows with duplicate id fields.

Suppose the table is:

id | a  | b
1  | 2  | 3
2  | 8  | 7
3  | 9  | 10
2  | 8  | 7
3  | 20 | 12

I want to get:

id | a  | b
1  | 2  | 3
2  | 8  | 7

The row with id 2 is kept as a single copy, and the rows with id 3 are deleted.

I'm thinking of:

  • SELECT DISTINCT id, a, b FROM table; to get only distinct rows.
  • Somehow filter the result of step 1 to remove the ids that are still duplicated.

What would be the best way to approach this?

3 answers

Peter, as in the comments, you want the COMBINATION ...

select ID, min(a) a, min(b) b
    from YourTable
    group by ID
    having min(a) = max(a)
       and min(b) = max(b)

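Here, a and b are checked per ID: a group is kept only when min() and max() agree for each column, which means every row of that ID has the same values.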

ID  MIN(A)  MIN(B)    Having:  MIN(A)  MAX(A)  MIN(B)  MAX(B)
1   2       3                  2       2       3       3
2   8       7                  8       8       7       7
3   9       10                 9       20      10      12

So ID = 3 is excluded, because its min() and max() differ for BOTH columns ...
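
To check the query above against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory SQLite database is an assumption made for illustration (the thread does not say which database is in use); the table name YourTable is taken from the query itself.

```python
import sqlite3

# Illustrative setup: in-memory SQLite database holding the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO YourTable (id, a, b) VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 8, 7), (3, 9, 10), (2, 8, 7), (3, 20, 12)],
)

# Keep an id only when all rows in its group share the same a and the same b.
query = """
    SELECT id, MIN(a) AS a, MIN(b) AS b
    FROM YourTable
    GROUP BY id
    HAVING MIN(a) = MAX(a)
       AND MIN(b) = MAX(b)
    ORDER BY id
"""
minmax_rows = conn.execute(query).fetchall()
print(minmax_rows)  # [(1, 2, 3), (2, 8, 7)]
```

One caveat, since the question mentions NULLs: MIN() and MAX() ignore NULL values, so rows that differ only by a NULL are treated as identical, and a group whose column is NULL in every row is dropped, because MIN(a) = MAX(a) then evaluates to NULL rather than true.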

Try this:

SELECT id, min(a) as a, min(b) as b
FROM (SELECT DISTINCT id, a, b FROM table) t
GROUP BY id
HAVING count(*) = 1
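
The DISTINCT-then-count approach can be tried the same way. This is a sketch under assumptions: it uses Python's built-in sqlite3 module with an in-memory database, names the table YourTable (a hypothetical name, since `table` is a reserved word), and adds two rows with id 4 that are not in the question, only to show how NULLs behave.

```python
import sqlite3

# Illustrative setup: in-memory SQLite database; YourTable is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO YourTable (id, a, b) VALUES (?, ?, ?)",
    [
        (1, 2, 3), (2, 8, 7), (3, 9, 10), (2, 8, 7), (3, 20, 12),
        (4, None, 5), (4, None, 5),  # extra rows (not in the question) to probe NULLs
    ],
)

# Step 1: collapse exact duplicates. Step 2: keep ids that have one row left.
query = """
    SELECT id, MIN(a) AS a, MIN(b) AS b
    FROM (SELECT DISTINCT id, a, b FROM YourTable) AS t
    GROUP BY id
    HAVING COUNT(*) = 1
    ORDER BY id
"""
distinct_rows = conn.execute(query).fetchall()
print(distinct_rows)  # [(1, 2, 3), (2, 8, 7), (4, None, 5)]
```

In SQLite, DISTINCT treats two NULLs as equal, so the duplicated id 4 rows collapse into one and are kept with a NULL in column a. Other database engines may need to be checked separately.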

, , , ? SQL .

Source: https://habr.com/ru/post/1788180/

