SQL query to display only duplicate records based on two columns

I have a table called employee with a lot of records in it. Here is some sample data:

fullname  | address  |  city
-----------------------------
AA          address1    City1
AA          address3    City1
AA          address8    City2
BB          address5    City2
BB          address2    City1
CC          address6    City1
CC          address7    City2
DD          address4    City1

I want a SELECT query in SQL Server that shows only the records that are duplicated on the columns fullname and city. For the sample data and this condition, only the first two records are duplicates. The expected result should therefore look like this:

fullname  | address  |  city
-----------------------------
AA          address1    City1
AA          address3    City1

To get this result, I wrote this query:

select fullname, city from employee group by fullname, city having count(*)>1

As you can see, it selects only two columns and thus gives the following output:

fullname  | city
------------------
AA          City1

If I rewrite the query as shown below:

select fullname, city, address from employee group by fullname, city, address 
having count(*)>1

it returns no rows at all. How do I write the query so that it returns the expected result?


Use the windowed version of COUNT:

SELECT fullname,
       address,
       city
FROM   (SELECT *,
               COUNT(*) OVER (PARTITION BY fullname, city) AS cnt
        FROM   employee) e
WHERE  cnt > 1 
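
To try the windowed COUNT without touching a real table, here is a minimal sketch that loads the sample data from the question into a table variable. The name @employee is a stand-in for the real employee table, and the multi-row VALUES list assumes SQL Server 2008 or later.

```sql
-- Stand-in for the real employee table (assumption: same three columns).
DECLARE @employee TABLE (
    fullname VARCHAR(25),
    address  VARCHAR(25),
    city     VARCHAR(25)
);

-- Sample rows copied from the question.
INSERT INTO @employee (fullname, address, city) VALUES
    ('AA', 'address1', 'City1'),
    ('AA', 'address3', 'City1'),
    ('AA', 'address8', 'City2'),
    ('BB', 'address5', 'City2'),
    ('BB', 'address2', 'City1'),
    ('CC', 'address6', 'City1'),
    ('CC', 'address7', 'City2'),
    ('DD', 'address4', 'City1');

-- Count the rows in each (fullname, city) group without collapsing them,
-- then keep only the rows whose group has more than one member.
SELECT fullname,
       address,
       city
FROM   (SELECT e.*,
               COUNT(*) OVER (PARTITION BY fullname, city) AS cnt
        FROM   @employee AS e) AS counted
WHERE  cnt > 1
ORDER  BY fullname, city, address;
-- Returns the two AA / City1 rows (address1 and address3).
```

Unlike GROUP BY, the window function keeps every source row, so address stays available in the outer SELECT.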

You can also join the table back to the grouped subquery:

   Select employee.* from employee 
   join (select fullname, city from employee group by fullname, city having count(*)>1) q1 
   on q1.fullname = employee.fullname and q1.city = employee.city 

Another option is CROSS APPLY:

create table ##Employee
(Fullname varchar(25),
 Address varchar(25),
 City varchar(25))

insert into ##Employee values
 ('AA', 'address1', 'City1')
,('AA', 'address3', 'City1')
,('AA', 'address8', 'City2')
,('BB', 'address5', 'City2')
,('BB', 'address2', 'City1')
,('CC', 'address6', 'City1')
,('CC', 'address7', 'City2')

select E.* from ##Employee E
cross apply(
    select Fullname, City, count(Fullname) cnt from ##Employee
    group by Fullname, City
    having count(Fullname) > 1) x
where E.Fullname = x.Fullname
and E.City = x.City

Alternatively, use EXISTS:

select e.*
from employee e
where exists (select 1
              from employee e2
              where e2.fullname = e.fullname and e2.city = e.city and
                    e2.address <> e.address  -- or id or some other unique column
             );

Although I would probably go with the window function approach, you may find that under some circumstances this one is faster (especially if you have an index on employee(fullname, city, address)).
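
For reference, a sketch of the index mentioned above. The index name is hypothetical, and whether it actually helps depends on the table size and the existing indexes, so treat it as something to test rather than a guaranteed improvement.

```sql
-- Hypothetical index name; the key columns are the ones named in the answer.
-- With fullname and city as the leading keys, the EXISTS subquery can seek
-- on them and compare address from the index alone.
CREATE NONCLUSTERED INDEX IX_employee_fullname_city_address
    ON employee (fullname, city, address);
```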


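Here is a solution. Note that it looks for rows that are duplicated across all three columns, so the sample data includes one such duplicate row: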

DECLARE  @Employee TABLE
        (
            Fullname VARCHAR(25),
            [Address] VARCHAR(25),
            City VARCHAR(25)
        )

      INSERT INTO @Employee VALUES
      ('AA', 'address1', 'City1') 
      ,('AA', 'address1', 'City1') 
      ,('AA', 'address3', 'City1')
      ,('AA', 'address8', 'City2')
      ,('BB', 'address5', 'City2')
      ,('BB', 'address2', 'City1')
      ,('CC', 'address6', 'City1')
      ,('CC', 'address7', 'City2')

     ;WITH cte AS (
               SELECT *,
                      ROW_NUMBER() OVER(PARTITION BY FullName, [Address], [City] ORDER BY Fullname) AS sl,
                      HashBytes('MD5', FullName + [Address] + [City]) AS RecordId
               FROM   @Employee AS e
           )

      SELECT c.FullName,
             c.[Address],
             c.City
      FROM   cte             AS c
             INNER JOIN cte  AS c1
                  ON  c.RecordId = c1.RecordId
      WHERE  c.sl = 2

Result:

FullName    Address     City
AA          address1    City1
AA          address1    City1

Source: https://habr.com/ru/post/1687521/

