SQL - select only one row for each duplicated key

I need to transfer data from one table to another. The second table has a primary key constraint (the first has no constraints). They have the same structure. I want to select all rows from table A and insert them into table B without duplicated rows (if a row is duplicated, I only want to take the first one found).

Example:

MyField1 (PK)   |   MyField2 (PK)   |   MyField3 (PK)   |   MyField4   |   MyField5
----------------|-------------------|-------------------|--------------|------------
1               |   'Test'          |   'A1'            |   'Data1'    |   'Data1'
2               |   'Test1'         |   'A2'            |   'Data2'    |   'Data2'
2               |   'Test1'         |   'A2'            |   'Data3'    |   'Data3'
4               |   'Test2'         |   'A3'            |   'Data4'    |   'Data4'

As you can see, the second and third lines have the same primary key but different data in MyField4 and MyField5. So, in this example, I would like to get the first, second, and fourth lines, but not the third, because it duplicates the second (even though MyField4 and MyField5 contain different data).

How can I do this with a single SELECT?

THX

+3
5 answers

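First, you need to define what "first" means. Rows in an SQL table have no inherent order, so you have to pick one explicitly. The query below defines it by MyField4, then by MyField5: for each key it keeps the row with the highest MyField4, breaking ties on the highest MyField5. Rows that are identical in all 5 columns are collapsed by DISTINCT.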

SELECT DISTINCT
     T1.MyField1,
     T1.MyField2,
     T1.MyField3,
     T1.MyField4,
     T1.MyField5
FROM
     MyTable T1
LEFT OUTER JOIN MyTable T2 ON
     T2.MyField1 = T1.MyField1 AND
     T2.MyField2 = T1.MyField2 AND
     T2.MyField3 = T1.MyField3 AND
     (
          T2.MyField4 > T1.MyField4 OR
          (
               T2.MyField4 = T1.MyField4 AND
               T2.MyField5 > T1.MyField5
          )
     )
WHERE
     T2.MyField1 IS NULL

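To see which rows the anti-join above actually keeps, here is a minimal sketch that runs the same query against the question's sample data. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in engine; the table name MyTable comes from the answer, while `rows_kept` is just an illustrative variable.

```python
import sqlite3

# In-memory SQLite database, used only as a convenient stand-in engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (MyField1 INTEGER, MyField2 TEXT, MyField3 TEXT,
                      MyField4 TEXT, MyField5 TEXT);
INSERT INTO MyTable VALUES
    (1, 'Test',  'A1', 'Data1', 'Data1'),
    (2, 'Test1', 'A2', 'Data2', 'Data2'),
    (2, 'Test1', 'A2', 'Data3', 'Data3'),
    (4, 'Test2', 'A3', 'Data4', 'Data4');
""")

# A row survives only if no other row with the same key has a greater
# (MyField4, MyField5) pair, so the greatest pair wins for each key.
anti_join = """
SELECT DISTINCT T1.MyField1, T1.MyField2, T1.MyField3, T1.MyField4, T1.MyField5
FROM MyTable T1
LEFT OUTER JOIN MyTable T2 ON
     T2.MyField1 = T1.MyField1 AND
     T2.MyField2 = T1.MyField2 AND
     T2.MyField3 = T1.MyField3 AND
     (T2.MyField4 > T1.MyField4 OR
      (T2.MyField4 = T1.MyField4 AND T2.MyField5 > T1.MyField5))
WHERE T2.MyField1 IS NULL
ORDER BY T1.MyField1
"""
rows_kept = conn.execute(anti_join).fetchall()
for row in rows_kept:
    print(row)
```

On this data the duplicated key keeps the 'Data3' row, not the 'Data2' row the question asks for. Reversing both comparisons (`<` instead of `>`) would keep the lowest pair instead.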

+4

If you simply want to keep only one of lines 2 and 3, in MySQL you can do:

insert ignore into new_table (select * from old_table);

The PK on the new table makes the duplicates fail, and IGNORE skips them instead of raising an error.
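The same idea can be tried without a MySQL server. The sketch below is an assumption-laden analog, not the MySQL statement itself: it uses SQLite's `INSERT OR IGNORE`, which likewise skips rows that violate the primary key. The `ORDER BY rowid` is added here so that the surviving row is deterministic; `copied` is an illustrative variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE old_table (MyField1 INTEGER, MyField2 TEXT, MyField3 TEXT,
                        MyField4 TEXT, MyField5 TEXT);
-- Same structure, plus the composite primary key from the question.
CREATE TABLE new_table (MyField1 INTEGER, MyField2 TEXT, MyField3 TEXT,
                        MyField4 TEXT, MyField5 TEXT,
                        PRIMARY KEY (MyField1, MyField2, MyField3));
INSERT INTO old_table VALUES
    (1, 'Test',  'A1', 'Data1', 'Data1'),
    (2, 'Test1', 'A2', 'Data2', 'Data2'),
    (2, 'Test1', 'A2', 'Data3', 'Data3'),
    (4, 'Test2', 'A3', 'Data4', 'Data4');
-- SQLite analog of MySQL's INSERT IGNORE: rows that would violate
-- the primary key are skipped silently.
INSERT OR IGNORE INTO new_table SELECT * FROM old_table ORDER BY rowid;
""")

copied = conn.execute("SELECT * FROM new_table ORDER BY MyField1").fetchall()
for row in copied:
    print(row)
```

Without an explicit ORDER BY in the SELECT, which of the duplicated rows arrives first (and is therefore kept) is up to the engine.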

+3

Which database are you using? In Oracle:

SELECT * FROM your_table
WHERE rowid in
(SELECT MIN(rowid)
 FROM your_table
 GROUP BY MyField1, MyField2, MyField3);

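For each PK value, this keeps the row with the lowest rowid as the "first" one. Note that the lowest rowid is not guaranteed to be the row that was inserted earliest.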
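As a rough illustration of the MIN(rowid) pattern, the sketch below runs it in SQLite, which also exposes an implicit rowid column. This is an assumption made for runnability: SQLite's rowid is an integer assigned on insert, whereas Oracle's ROWID is a physical row address, so the two are not interchangeable. `first_rows` is an illustrative variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table (MyField1 INTEGER, MyField2 TEXT, MyField3 TEXT,
                         MyField4 TEXT, MyField5 TEXT);
INSERT INTO your_table VALUES
    (1, 'Test',  'A1', 'Data1', 'Data1'),
    (2, 'Test1', 'A2', 'Data2', 'Data2'),
    (2, 'Test1', 'A2', 'Data3', 'Data3'),
    (4, 'Test2', 'A3', 'Data4', 'Data4');
""")

# One rowid per key group: the smallest one, i.e. the earliest insert here.
first_rows = conn.execute("""
SELECT * FROM your_table
WHERE rowid IN
(SELECT MIN(rowid)
 FROM your_table
 GROUP BY MyField1, MyField2, MyField3)
ORDER BY MyField1
""").fetchall()
for row in first_rows:
    print(row)
```

In this SQLite run the 'Data2' row is the one kept for the duplicated key, which matches what the question asks for.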

+2

It depends on what you are looking for.

There is a big difference between LEFT JOIN + WHERE ... IS NULL, NOT IN, and NOT EXISTS, including in performance, which matters more on large datasets.

(See "NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL".)

The three methods shown in the linked article are quite simple.
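Apart from performance, the three forms can also differ in results when NULLs are involved. The sketch below is a small illustration under assumed conditions: two hypothetical single-column tables `a` and `b`, run in SQLite, where `b` contains a NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables, not from the question.
CREATE TABLE a (id INTEGER);
CREATE TABLE b (id INTEGER);
INSERT INTO a VALUES (1), (2), (3);
INSERT INTO b VALUES (2), (NULL);
""")

def ids(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# Rows of a that have no match in b, written three ways.
left_join = ids("SELECT a.id FROM a LEFT JOIN b ON b.id = a.id "
                "WHERE b.id IS NULL ORDER BY a.id")
not_exists = ids("SELECT a.id FROM a WHERE NOT EXISTS "
                 "(SELECT 1 FROM b WHERE b.id = a.id) ORDER BY a.id")
# NOT IN compares against NULL, which yields UNKNOWN for unmatched rows,
# so none of them pass the WHERE clause.
not_in = ids("SELECT a.id FROM a WHERE a.id NOT IN (SELECT id FROM b) "
             "ORDER BY a.id")

print(left_join, not_exists, not_in)
```

Here LEFT JOIN / IS NULL and NOT EXISTS both return 1 and 3, while NOT IN returns nothing because of the NULL in `b`.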

+1
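CREATE TABLE #A(
ID INTEGER IDENTITY,
[MyField1] [int] NULL,
[MyField2] [varchar](10) NULL,
[MyField3] [varchar](10) NULL,
[MyField4] [varchar](10) NULL,
[MyField5] [varchar](10) NULL
) 

-- Copy A into a temp table so that every row gets a sequential ID
-- (IDs follow whatever order the SELECT returns the rows in)
INSERT INTO #A (MyField1,MyField2,MyField3,MyField4,MyField5) SELECT * FROM A

-- Keep only the lowest-ID row for each (MyField1, MyField2, MyField3) key
insert into B 
   select MyField1,MyField2,MyField3,MyField4,MyField5 from #A a1 
    where not exists (select id from #A a2
                      where a2.MyField1 = a1.MyField1
                        and a2.MyField2 = a1.MyField2
                        and a2.MyField3 = a1.MyField3
                        and a2.ID < a1.ID)

DROP TABLE #A

OR

-- Without a temp table: per key, keep the row with the greatest (MyField4, MyField5)
insert into b
  select distinct * from a a1 
    where not exists (
  select a2.MyField1 from a a2
   where a1.MyField1 = a2.MyField1
     and a1.MyField2 = a2.MyField2
     and a1.MyField3 = a2.MyField3
     and (a1.MyField4 < a2.MyField4
          or (a1.MyField4 = a2.MyField4 and a1.MyField5 < a2.MyField5)))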
0

Source: https://habr.com/ru/post/1702946/

