MySQL: finding duplicate names in my user table

I want to find all users whose name appears at least twice in my User table. "email" is a unique field, but the combination of "firstName" and "lastName" is not necessarily unique.

So far, I have come up with the following query, which is very slow, and I'm not even sure if it is correct. Please let me know how best to rewrite this.

 SELECT CONCAT(u2.firstName, u2.lastName) AS fullName
 FROM cpnc_User u2
 WHERE CONCAT(u2.firstName, u2.lastName) IN (
     SELECT CONCAT(u2.firstName, u2.lastName) AS fullNm
     FROM cpnc_User u1
     GROUP BY fullNm
     HAVING COUNT(*) > 1
 )

Also note that the above returns (I think) a list of names that appear at least twice, but what I really want is the complete list of user IDs for those names. Each such name, since it appears at least twice, will be associated with at least two values of the "id" primary key.

Thanks for any help! Jonah

+4
5 answers
 SELECT u.*
 FROM cpnc_User u
 JOIN (
     SELECT firstName, lastName
     FROM cpnc_User
     GROUP BY firstName, lastName
     HAVING COUNT(*) > 1
 ) x ON x.firstName = u.firstName AND x.lastName = u.lastName
 ORDER BY u.firstName, u.lastName

There is no need to build a concatenated field; just group and join on the two columns separately.
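One reason to compare the two columns rather than a concatenation is that a concatenation without a separator can merge different names into the same string. The following is a minimal sketch of that effect, not a benchmark. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (so `||` replaces `CONCAT()`), and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpnc_User (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT)"
)
# Hypothetical sample data, not taken from the question.
conn.executemany(
    "INSERT INTO cpnc_User (id, firstName, lastName) VALUES (?, ?, ?)",
    [
        (1, "Ann", "Lee"),
        (2, "Ann", "Lee"),   # genuine duplicate of id 1
        (3, "Jo", "Anne"),   # concatenates to "JoAnne"
        (4, "JoAn", "ne"),   # also "JoAnne", but a different name
        (5, "Bob", "Ray"),
    ],
)

# Grouping on the concatenated string (SQLite's || stands in for CONCAT()).
by_concat = [row[0] for row in conn.execute(
    """
    SELECT u.id
    FROM cpnc_User u
    JOIN (
        SELECT firstName || lastName AS fullName
        FROM cpnc_User
        GROUP BY firstName || lastName
        HAVING COUNT(*) > 1
    ) x ON x.fullName = u.firstName || u.lastName
    ORDER BY u.id
    """
)]

# Grouping and joining on the two columns separately.
by_columns = [row[0] for row in conn.execute(
    """
    SELECT u.id
    FROM cpnc_User u
    JOIN (
        SELECT firstName, lastName
        FROM cpnc_User
        GROUP BY firstName, lastName
        HAVING COUNT(*) > 1
    ) x ON x.firstName = u.firstName AND x.lastName = u.lastName
    ORDER BY u.id
    """
)]

print(by_concat)   # [1, 2, 3, 4]  (ids 3 and 4 are false positives)
print(by_columns)  # [1, 2]
```

With these rows, the concatenated version reports ids 3 and 4 as duplicates even though their first and last names differ.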

+7
 SELECT u.id, u.firstName, u.lastName
 FROM cpnc_User u, (
     SELECT uc.firstName, uc.lastName
     FROM cpnc_User uc
     GROUP BY uc.firstName, uc.lastName
     HAVING COUNT(*) > 1
 ) u2
 WHERE u.firstName = u2.firstName
   AND u.lastName = u2.lastName
+3
 SELECT u.id, CONCAT(u.firstName, ' ', u.lastName) AS fullname
 FROM cpnc_User u
 JOIN (
     SELECT MIN(id) AS minid, firstName, lastName
     FROM cpnc_User
     GROUP BY firstName, lastName
     HAVING COUNT(*) > 1
 ) AS grp ON u.firstName = grp.firstName AND u.lastName = grp.lastName
 ORDER BY grp.minid, u.id

ORDER BY grp.minid ensures that users with the same first and last names remain grouped together in the output.
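To see the effect of that ordering, here is a small sketch that runs the same query shape twice, once sorted by `grp.minid, u.id` and once by `u.id` alone. It assumes Python's built-in sqlite3 module in place of MySQL (so `||` replaces `CONCAT()`), and the rows are made up so that the two duplicate groups interleave by id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpnc_User (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT)"
)
# Hypothetical rows: the two duplicate groups alternate by id.
conn.executemany(
    "INSERT INTO cpnc_User (id, firstName, lastName) VALUES (?, ?, ?)",
    [
        (1, "Zed", "Young"),
        (2, "Amy", "Adams"),
        (3, "Zed", "Young"),
        (4, "Amy", "Adams"),
        (5, "Solo", "Person"),  # appears once, so it is filtered out
    ],
)

# Same join in both runs; only the ORDER BY clause differs.
QUERY = """
    SELECT u.id, u.firstName || ' ' || u.lastName AS fullname
    FROM cpnc_User u
    JOIN (
        SELECT MIN(id) AS minid, firstName, lastName
        FROM cpnc_User
        GROUP BY firstName, lastName
        HAVING COUNT(*) > 1
    ) AS grp ON u.firstName = grp.firstName AND u.lastName = grp.lastName
    ORDER BY {order}
"""

grouped = conn.execute(QUERY.format(order="grp.minid, u.id")).fetchall()
by_id = conn.execute(QUERY.format(order="u.id")).fetchall()

print(grouped)
# [(1, 'Zed Young'), (3, 'Zed Young'), (2, 'Amy Adams'), (4, 'Amy Adams')]
print([row[0] for row in by_id])
# [1, 2, 3, 4]
```

Sorting by the smallest id in each group keeps each name's rows adjacent, whereas sorting by `u.id` alone interleaves the two names.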

+2

As an experiment, I created a simple table with two columns, a user ID and a name. I inserted a bunch of records, including some duplicates, and then ran this query:

 SELECT count(id) AS count, group_concat(id) as IDs FROM test GROUP BY `name` ORDER BY count DESC 

It should give you the following results:

 +-------+----------+
 | count | IDs      |
 +-------+----------+
 |     4 | 7,15,4,1 |
 |     2 | 2,8      |
 |     2 | 6,13     |
 |     2 | 14,9     |
 |     1 | 11       |
 |     1 | 10       |
 |     1 | 3        |
 |     1 | 5        |
 |     1 | 17       |
 |     1 | 12       |
 |     1 | 16       |
 +-------+----------+

The rows with a count of 1 are not duplicates, so you will still need to filter them out, for example with a HAVING clause.
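One way to do that filtering inside the query itself is a `HAVING` clause on the count. The sketch below is an illustration under assumptions, not the original experiment. It uses Python's built-in sqlite3 module (which also provides `GROUP_CONCAT`) instead of MySQL, an invented `test` table, and an extra `name` column in the result so each ID list can be matched to its name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical sample data with two duplicated names.
conn.executemany(
    "INSERT INTO test (id, name) VALUES (?, ?)",
    [
        (1, "alice"), (2, "bob"), (3, "carol"), (4, "alice"),
        (5, "dave"), (7, "alice"), (8, "bob"),
    ],
)

rows = conn.execute(
    """
    SELECT name, COUNT(id) AS cnt, GROUP_CONCAT(id) AS ids
    FROM test
    GROUP BY name
    HAVING COUNT(id) > 1   -- drop names that occur only once
    ORDER BY cnt DESC
    """
).fetchall()

# The order of ids inside GROUP_CONCAT is not guaranteed, so sort them here.
duplicates = {
    name: sorted(int(i) for i in ids.split(","))
    for name, cnt, ids in rows
}
print(duplicates)  # {'alice': [1, 4, 7], 'bob': [2, 8]}
```

Only names that occur more than once survive the `HAVING` filter, and each one carries its full list of IDs.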

+2

You are concatenating the two columns and then comparing the result, which means the database has to compute the concatenation for every single row in the table before it can compare anything.

How about a slightly different approach: keep the last name and first name separate. First, select all rows whose last name appears more than once in the table. That drastically reduces the set of candidate rows.

Then compare first names within that reduced set to find the rows that match.
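The two-step idea could be sketched roughly as follows. This is one possible reading of the suggestion, not a tested optimization. It assumes Python's built-in sqlite3 module in place of MySQL and invented sample rows, and whether it beats a plain `GROUP BY firstName, lastName` depends on the data and the available indexes, which is not measured here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cpnc_User (id INTEGER PRIMARY KEY, firstName TEXT, lastName TEXT)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO cpnc_User (id, firstName, lastName) VALUES (?, ?, ?)",
    [
        (1, "Ann", "Lee"),
        (2, "Ann", "Lee"),
        (3, "Bob", "Lee"),   # shares a last name but not a first name
        (4, "Cy", "Ray"),
        (5, "Dee", "Fox"),
        (6, "Dee", "Kim"),   # shares a first name only
    ],
)

# Step 1: keep only rows whose last name occurs more than once.
SHARED_LAST_NAME = """
    SELECT id, firstName, lastName
    FROM cpnc_User
    WHERE lastName IN (
        SELECT lastName FROM cpnc_User GROUP BY lastName HAVING COUNT(*) > 1
    )
"""
candidates = [
    row[0]
    for row in conn.execute(f"SELECT id FROM ({SHARED_LAST_NAME}) ORDER BY id")
]

# Step 2: within that reduced set, find full names that occur more than once.
duplicates = [row[0] for row in conn.execute(
    f"""
    SELECT c.id
    FROM ({SHARED_LAST_NAME}) c
    JOIN (
        SELECT firstName, lastName
        FROM ({SHARED_LAST_NAME})
        GROUP BY firstName, lastName
        HAVING COUNT(*) > 1
    ) d ON d.firstName = c.firstName AND d.lastName = c.lastName
    ORDER BY c.id
    """
)]

print(candidates)  # [1, 2, 3]
print(duplicates)  # [1, 2]
```

With this data the first step narrows six rows down to three candidates, and the second step keeps only the two rows that share both names.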

+1

Source: https://habr.com/ru/post/1346471/

