Checking for duplicates in SQL Server

Question

What is the best way to identify duplicate records in a SQL Server table?

For example, I want to find the last duplicate email received in the table (the table has a primary key, a received date, and an email field).

Sample data:

 1  01/01/2008  stuff@stuff.com
 2  02/01/2008  stuff@stuff.com
 3  01/12/2008  noone@stuff.com

sql sql-server

doanair Sep 05 '08 at 20:36

7 answers

SQLMenace · Answer 1 · 2008-09-05T20:38:39+0000

Something like this:

 select email, max(receiveddate) as MaxDate
 from YourTable
 group by email
 having count(email) > 1
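
If the primary key of that last duplicate row is also needed, one option is to join the grouped result back to the table. This is only a sketch: the table variable `@YourTable` and the column names `id`, `receiveddate` and `email` are assumptions, and the sample dates from the question are read as MM/DD/YYYY.

```sql
-- Assumed table and column names; sample rows from the question,
-- with dates interpreted as MM/DD/YYYY (an assumption).
DECLARE @YourTable TABLE (
    id INT PRIMARY KEY,
    receiveddate DATETIME,
    email VARCHAR(100)
);

INSERT INTO @YourTable (id, receiveddate, email) VALUES (1, '20080101', 'stuff@stuff.com');
INSERT INTO @YourTable (id, receiveddate, email) VALUES (2, '20080201', 'stuff@stuff.com');
INSERT INTO @YourTable (id, receiveddate, email) VALUES (3, '20080112', 'noone@stuff.com');

-- Join each duplicated email's latest date back to the table
-- to recover the full row, including its primary key.
SELECT t.id, t.receiveddate, t.email
FROM @YourTable AS t
JOIN (
    SELECT email, MAX(receiveddate) AS MaxDate
    FROM @YourTable
    GROUP BY email
    HAVING COUNT(email) > 1
) AS d
    ON d.email = t.email
   AND d.MaxDate = t.receiveddate;
```

If two rows for the same email share the same latest date, both are returned.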

user2051770 · Answer 2 · 2013-02-07T17:33:56+0000

Try something like:

 SELECT *
 FROM (
     SELECT *,
         ROW_NUMBER() OVER (PARTITION BY Email ORDER BY ReceivedDate DESC) AS RowNumber,
         COUNT(*) OVER (PARTITION BY Email) AS EmailCount
     FROM EmailTable
 ) a
 WHERE RowNumber = 1 AND EmailCount > 1

See http://www.technicaloverload.com/working-with-duplicates-in-sql-server/

Rob Rolnick · Answer 3 · 2008-09-05T20:38:01+0000

Couldn't you join the table to itself on the email field and then see which NULLs you get in the result?

Or better yet, count the instances of each email address and return only the ones with a count > 1?

Or even take the email and id fields and return the entries where the email is the same and the ids are different. (To avoid returning each pair twice, don't use != but rather either < or >.)
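
The last two suggestions could be sketched roughly as follows. The table variable `@EmailTable` and its column names are hypothetical, and the sample dates from the question are read as MM/DD/YYYY, which is an assumption.

```sql
-- Hypothetical table and column names; sample rows from the question,
-- with dates interpreted as MM/DD/YYYY (an assumption).
DECLARE @EmailTable TABLE (
    id INT PRIMARY KEY,
    receiveddate DATETIME,
    email VARCHAR(100)
);

INSERT INTO @EmailTable (id, receiveddate, email) VALUES (1, '20080101', 'stuff@stuff.com');
INSERT INTO @EmailTable (id, receiveddate, email) VALUES (2, '20080201', 'stuff@stuff.com');
INSERT INTO @EmailTable (id, receiveddate, email) VALUES (3, '20080112', 'noone@stuff.com');

-- Count approach: email addresses that appear more than once.
SELECT email, COUNT(*) AS occurrences
FROM @EmailTable
GROUP BY email
HAVING COUNT(*) > 1;

-- Self-join approach: rows with the same email but different ids.
-- Using a.id < b.id (rather than <>) lists each pair only once.
SELECT a.id AS first_id, b.id AS second_id, a.email
FROM @EmailTable AS a
JOIN @EmailTable AS b
    ON a.email = b.email
   AND a.id < b.id;
```

With the three sample rows, both queries report only stuff@stuff.com: the first with a count of 2, the second as the pair of ids 1 and 2.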

Michael sharek · Answer 4 · 2008-09-05T20:38:42+0000

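Try this (the extra condition on id, assumed here to be the primary key, keeps a row from matching itself):

 select * from [table] a, [table] b where a.email = b.email and a.id < b.id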

palehorse · Answer 5 · 2008-09-05T20:40:10+0000

 SELECT [id], [receivedate], [email]
 FROM [mytable]
 WHERE [email] IN (
     SELECT [email]
     FROM [mytable]
     GROUP BY [email]
     HAVING COUNT([email]) > 1
 )

enigmatic · Answer 6 · 2008-09-05T20:42:57+0000

Do you want a list of the most recent items (the latest row for each email address)? If so, you can use:

 SELECT [info]
 FROM [table] t
 WHERE NOT EXISTS (
     SELECT *
     FROM [table] tCheck
     WHERE tCheck.[email] = t.[email]
       AND tCheck.[date] > t.[date]
 )

If you want a list of all duplicated email addresses, use GROUP BY to collect the matching rows, then a HAVING clause to make sure the count is greater than 1:

 SELECT [email] FROM [table] GROUP BY [email] HAVING COUNT(*) > 1

If you want only the last duplicated email (a single result), simply add "TOP 1" and "ORDER BY":

 SELECT TOP 1 [email], MAX([date]) AS LastReceived
 FROM [table]
 GROUP BY [email]
 HAVING COUNT(*) > 1
 ORDER BY MAX([date]) DESC

Brian hart · Answer 7 · 2008-09-05T20:47:56+0000

If you have a surrogate key, it is relatively easy to use the GROUP BY syntax mentioned in SQLMenace's post. Essentially, group by all the fields that make two or more rows "the same".

An example of removing duplicate entries (the column types are illustrative):

 CREATE TABLE people (ID INT PRIMARY KEY, Name VARCHAR(100), Address VARCHAR(200), DOB DATETIME);
 -- Keep the lowest ID in each group of identical rows and delete the rest
 DELETE FROM people
 WHERE ID NOT IN (
     SELECT MIN(ID)
     FROM people
     GROUP BY Name, Address, DOB
 );
