Find duplicate records in MySQL using LIKE

I would like to find all duplicate entries by name in the customer table using MySQL, including those that do not match exactly.

I know I can use the query

SELECT id, name FROM customer GROUP BY name HAVING count(*) > 1; 

to find all rows that match exactly, but I want to find all duplicate rows that match using LIKE. For example, there may be a customer named "Mark Widgets" and another named "Mark Widgets Inc.", and I would like my query to find them as duplicates. So, something like

 SELECT id, name AS name1 ... WHERE name1 LIKE CONCAT("%", name2, "%") ... 

I know that is completely wrong, but that's the idea. Here is the relevant part of the schema:

 mysql> describe customer;
 +-------+--------------+------+-----+---------+----------------+
 | Field | Type         | Null | Key | Default | Extra          |
 +-------+--------------+------+-----+---------+----------------+
 | id    | int(11)      | NO   | PRI | NULL    | auto_increment |
 | name  | varchar(140) | NO   |     | NULL    |                |
 ...

EDIT: To clarify, I want to find all duplicates, not just duplicates of one specific customer name.

+4
5 answers

It is quite possible to do this, but before you start you need to define your rules for what counts as a match and what does not. Without that you cannot get anywhere.

You could, for example, ignore the first and last 3 characters of the name and match on the middle characters, or you could choose more complex logic. There is no magic method to achieve what you want; you have to code the logic yourself. Whatever you choose, it must be defined before you begin, and before we can really help.
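To make the "define your rules first" point concrete, here is a minimal sketch of the first rule mentioned above (drop the first and last 3 characters and compare what is left). It assumes the customer table from the question. The alias core and the length filter are my own additions, and the rule is only an illustration, not a recommendation.

```sql
-- Sketch only: treat names as duplicates when they agree after
-- dropping the first 3 and the last 3 characters.
SELECT SUBSTRING(name, 4, CHAR_LENGTH(name) - 6) AS core,  -- hypothetical alias
       COUNT(*) AS cnt,
       GROUP_CONCAT(id ORDER BY id) AS ids                 -- ids sharing the same middle part
FROM customer
WHERE CHAR_LENGTH(name) > 6   -- shorter names have no middle part to compare
GROUP BY core
HAVING COUNT(*) > 1;
```

Note that this particular rule would not pair "Mark Widgets" with "Mark Widgets Inc.", since their middle parts differ, which is exactly why the rule has to be chosen to fit the data.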

I don't have MySQL at hand, so excuse any syntax errors (this is T-SQL syntax, if anything), but I'm thinking of a self-join:

 SELECT t1.ID
 FROM MyTable t1
 LEFT OUTER JOIN MyTable t2
   ON t1.name LIKE CONCAT('%', t2.name, '%')
 GROUP BY t1.ID
 HAVING COUNT(*) > 1
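
A note on what this returns, as far as I can tell: every row matches itself, so the HAVING filter keeps only rows whose name contains some other row's name (or has an exact twin). "Mark Widgets Inc." would be listed, but "Mark Widgets" on its own would not. If both sides are wanted, one option is to list the matching pairs instead. A sketch against the customer table from the question; the column aliases are mine:

```sql
-- Sketch only: list each (containing name, contained name) pair,
-- excluding the trivial match of a row with itself.
SELECT a.id   AS longer_id,
       a.name AS longer_name,
       b.id   AS shorter_id,
       b.name AS shorter_name
FROM customer AS a
JOIN customer AS b
  ON a.id <> b.id
 AND a.name LIKE CONCAT('%', b.name, '%')
ORDER BY b.name, a.name;
```

Exact duplicates show up twice here, once in each direction. Also keep in mind that any % or _ inside a stored name is treated as a wildcard by LIKE; LOCATE(b.name, a.name) > 0 is a plain substring test that avoids this.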
+3

I think this will work, but in my experience, putting functions inside ON clauses takes a ridiculous amount of time to process, especially when combined with LIKE. Still, it is a little better than a cross join.

 SELECT cust1.id, cust1.name
 FROM customer AS cust1
 INNER JOIN customer AS cust2
   ON cust1.name LIKE CONCAT('%', cust2.name, '%')
 GROUP BY cust1.id, cust1.name
 HAVING COUNT(*) > 1
0

How about this one? You can substitute a LIKE condition for a.name = b.name if that matters.

 Select a.id, b.id from customer a, customer b where a.name = b.name and a.id != b.id; 
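
One thing to be aware of with a.id != b.id: each pair comes back twice, once as (a, b) and once as (b, a). A small variation, sketched here under the same assumptions (customer table from the question, aliases are mine), uses a.id < b.id so that every pair is reported once:

```sql
-- Sketch only: exact-name duplicates, each pair listed a single time.
SELECT a.id AS first_id,
       b.id AS second_id,
       a.name
FROM customer AS a
JOIN customer AS b
  ON a.name = b.name
 AND a.id < b.id;   -- the lower id is always on the left, so no mirrored rows
```

With three customers sharing a name this yields three rows, one per pair.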
0

My answer would be:

 SELECT A.* FROM customer AS A, customer AS B WHERE A.name LIKE CONCAT('%', B.name, '%') AND A.name = B.name GROUP BY A.id HAVING COUNT(*) > 1 
0
 SELECT * FROM customer WHERE name LIKE "%Mark Widgets%"; 

http://www.mysqltutorial.org/sql-like-mysql.aspx should also help with the LIKE operator.

Not sure why you need the CONCAT part, so this might be too simple.

-1

Source: https://habr.com/ru/post/1301162/

