Delete duplicate data and load into another table in SQL Server

Question

I have a question about SQL Server.

Table: emp

    empid | name | sal
    1     | abc  | 100
    2     | def  | 200
    3     | test | 300
    2     | har  | 500
    3     | jai  | 600
    4     | kali | 240

As shown above, this table contains duplicate data. I want to remove the duplicate rows from the emp table and load them into the empduplicate table.

empid is supposed to be unique, so if an empid appears more than once, every row with that empid is considered a duplicate.

The empduplicate table has the following structure:

    empid | name | sal
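For anyone who wants to reproduce this, here is a minimal setup sketch. The column types (INT, VARCHAR(50)) are assumptions, since only the column names are given above; the rows are the sample data from the emp table.

```sql
-- Assumed column types; only the column names come from the question.
CREATE TABLE emp (
    empid INT,
    name  VARCHAR(50),
    sal   INT
);

CREATE TABLE empduplicate (
    empid INT,
    name  VARCHAR(50),
    sal   INT
);

-- Sample rows from the emp table shown in the question.
INSERT INTO emp (empid, name, sal)
VALUES (1, 'abc',  100),
       (2, 'def',  200),
       (3, 'test', 300),
       (2, 'har',  500),
       (3, 'jai',  600),
       (4, 'kali', 240);
```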

Finally, after deleting the duplicate data, I want the data in the emp table to look like this:

    empid | name | sal
    1     | abc  | 100
    4     | kali | 240

To remove duplicates, I tried this code:

    ;with duplicate as (
        select *,
               row_number() over (partition by empid order by empid) as rn
        from emp
    )
    delete from duplicate
    where rn > 1

But this does not delete all of the entries: one row per empid is left behind.

Example: empid=2 has duplicate data

    empid | name | sal
    2     | def  | 200
    2     | har  | 500

I need to delete every entry matching empid=2. Because empid=2 has a duplicate, all of its rows must be removed from the emp table.

And the empduplicate table should load duplicate data:

    empid | name | sal
    2     | def  | 200
    2     | har  | 500
    3     | test | 300
    3     | jai  | 600

To load the duplicate data, I tried this code:

    insert into empduplicate
    select id, name, sal
    from emp
    group by id
    having count(*) > 1

This query causes an error:

Column "duplicate.name" is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Please tell me how to write a query to complete this task in SQL Server.

+5

sql-server sql-server-2008

ravi Dec 23 '15 at 3:03

3 answers

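    BEGIN TRAN

    -- Copy every row whose empid occurs more than once.
    -- Note: SELECT ... INTO creates empduplicate; if the table already
    -- exists, use INSERT INTO empduplicate SELECT ... instead.
    SELECT *
    INTO empduplicate
    FROM (
        SELECT *
        FROM emp
        WHERE empid IN (
            SELECT empid FROM emp GROUP BY empid HAVING COUNT(empid) > 1
        )
    ) AS M

    -- Then remove those rows from emp.
    DELETE FROM emp
    WHERE empid IN (
        SELECT empid FROM emp GROUP BY empid HAVING COUNT(empid) > 1
    )

    COMMIT TRAN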

0

Balajishriram Dec 23 '15 at 9:21

    -- Run each snippet as its own batch: #tmp is created more than once.

    -- All distinct rows (removes rows that are identical in every column)
    SELECT DISTINCT * INTO #tmp FROM emp
    DELETE FROM emp
    INSERT INTO emp SELECT * FROM #tmp
    DROP TABLE #tmp
    SELECT * FROM emp

    -- All duplicate values: load them into empduplicate
    -- (this has to run before the duplicated ids are removed from emp)
    INSERT INTO empduplicate
    SELECT * FROM emp
    WHERE empid IN (
        SELECT empid FROM emp GROUP BY empid HAVING COUNT(*) > 1
    )
    SELECT * FROM empduplicate

    -- All ids which are not duplicated: keep only those rows in emp
    SELECT * INTO #tmp FROM emp
    WHERE empid IN (
        SELECT empid FROM emp GROUP BY empid HAVING COUNT(*) = 1
    )
    DELETE FROM emp
    INSERT INTO emp SELECT * FROM #tmp
    DROP TABLE #tmp
    SELECT * FROM emp

0

Amee Dec 23 '15 at 10:56

Felix pamittan · Accepted Answer · 2015-12-23T03:10:04+0000

You are almost there. Instead of ROW_NUMBER, use COUNT:

    WITH CteInsert AS(
        SELECT *,
            cnt = COUNT(empid) OVER(PARTITION BY empid)
        FROM emp
    )
    INSERT INTO empduplicate(empid, name, sal)
    SELECT empid, name, sal
    FROM CteInsert
    WHERE cnt > 1;

    WITH CteDelete AS(
        SELECT *,
            cnt = COUNT(empid) OVER(PARTITION BY empid)
        FROM emp
    )
    DELETE FROM CteDelete
    WHERE cnt > 1;
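To see why COUNT works here where ROW_NUMBER did not, it can help to look at both window values side by side. This is only an illustrative query against the emp table from the question; the aliases rn and cnt are just labels.

```sql
-- Compare the per-row sequence number with the per-group row count.
SELECT empid,
       name,
       sal,
       ROW_NUMBER() OVER (PARTITION BY empid ORDER BY empid) AS rn,
       COUNT(empid) OVER (PARTITION BY empid)                AS cnt
FROM emp
ORDER BY empid;
```

Within each duplicated empid exactly one row gets rn = 1, so a filter of rn > 1 always leaves one row behind. cnt is the same on every row of the group, so cnt > 1 matches all of them.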

You need to do the INSERT before the DELETE. You may also want to wrap both statements in a single transaction.
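A rough sketch of the single-transaction variant, assuming the emp and empduplicate tables from the question. The TRY...CATCH wrapper and the RAISERROR call are additions for illustration, not part of the statements above; they are used here because both are available in SQL Server 2008.

```sql
BEGIN TRY
    BEGIN TRANSACTION;

    -- 1. Copy every row whose empid occurs more than once.
    WITH CteInsert AS (
        SELECT *,
               cnt = COUNT(empid) OVER (PARTITION BY empid)
        FROM emp
    )
    INSERT INTO empduplicate (empid, name, sal)
    SELECT empid, name, sal
    FROM CteInsert
    WHERE cnt > 1;

    -- 2. Remove those same rows from emp.
    WITH CteDelete AS (
        SELECT *,
               cnt = COUNT(empid) OVER (PARTITION BY empid)
        FROM emp
    )
    DELETE FROM CteDelete
    WHERE cnt > 1;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- Undo the INSERT if the DELETE (or anything else) failed.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- Re-raise the original error text to the caller.
    DECLARE @msg NVARCHAR(2048);
    SET @msg = ERROR_MESSAGE();
    RAISERROR('%s', 16, 1, @msg);
END CATCH;
```

With this shape, either both tables are updated or neither is, so a failure between the two steps cannot leave rows copied into empduplicate but still present in emp.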
