How to make quotes and apostrophes match when comparing strings in MySQL (collation)

MySQL uses collations to compare strings, because some characters should be treated as equal.

Example:

SELECT 'é' = 'e' COLLATE utf8_unicode_ci;
SELECT 'oe' = 'œ' COLLATE utf8_unicode_ci; 

Both return true.

Now, how can I do the same for the straight quote (') versus the apostrophe (’)?

They are not the same character: the one that should be used to write "you’re" or "l’oiseau" (in French) is the apostrophe.

The problem is that neither utf8_general_ci nor utf8_unicode_ci treats them as equal.
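
The mismatch can be checked with the same kind of comparison as in the examples above. This is a minimal sketch, assuming the connection character set is utf8 so that the COLLATE clause applies; given the behavior described here, it should not return true:

```sql
-- Straight quote (U+0027) versus typographic apostrophe (U+2019).
-- The straight quote is doubled to escape it inside the string literal.
SELECT '''' = '’' COLLATE utf8_unicode_ci;
```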

The naive solution is to store everything with straight quotes and replace every apostrophe when the user searches, but that is wrong.
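
For reference, that kind of normalization might be sketched as follows. The table name mytable and the column name data are hypothetical, and this variant folds the apostrophe at query time on both sides instead of rewriting the stored rows:

```sql
-- Hypothetical table and column names; 'l’oiseau' stands in for the user's search term.
-- Fold the typographic apostrophe (U+2019) into the straight quote on both sides,
-- then let the column's collation handle case and accents as usual.
SELECT *
FROM mytable
WHERE REPLACE(data, '’', '''') = REPLACE('l’oiseau', '’', '''');
```

Wrapping the column in REPLACE will most likely prevent an index on it from being used, which is one more reason this approach is unsatisfying.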

The real solution would be to create a custom collation based on utf8_unicode_ci that marks the two characters as equivalent, but that requires editing XML configuration files and restarting the database, which is not always possible.

How do you do this?

1 answer

A custom collation seems the most appropriate approach, but if that is not possible, you may be able to adapt your search queries to use regular expressions. It is not exactly ideal, but it may be useful in some situations. At the very least, it lets you store the data in the correct form (without replacing quotes) and perform the replacement only in the search query itself:

INSERT INTO mytable VALUES
(1, 'Though this be madness, yet there is method in ''t'),
(2, 'Though this be madness, yet there is method in ’t'),
(3, 'There ’s daggers in men’s smiles'),
(4, 'There ’s daggers in men''s smiles');

SELECT * FROM mytable WHERE data REGEXP 'There [\'’]+s daggers in men[\'’]+s smiles';

+----+--------------------------------------+
| id | data                                 |
+----+--------------------------------------+
|  3 | There ’s daggers in men’s smiles     |
|  4 | There ’s daggers in men's smiles     |
+----+--------------------------------------+
+----+--------------------------------------+

SELECT * FROM mytable WHERE data REGEXP 'Though this be madness, yet there is method in [\'’]+t';

+----+-----------------------------------------------------+
| id | data                                                |
+----+-----------------------------------------------------+
|  1 | Though this be madness, yet there is method in 't   |
|  2 | Though this be madness, yet there is method in ’t   |
+----+-----------------------------------------------------+
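
The replacement in the search query does not have to be written by hand. As a rough sketch, assuming the same mytable as above, nested REPLACE calls can turn a plain search term into the pattern used in the first query:

```sql
-- Inner REPLACE: fold the typographic apostrophe into the straight quote.
-- Outer REPLACE: turn every straight quote into the character class ['’]+,
-- which yields: There ['’]+s daggers in men['’]+s smiles
SELECT *
FROM mytable
WHERE data REGEXP REPLACE(
        REPLACE('There ’s daggers in men''s smiles', '’', ''''),
        '''', '[''’]+');
```

This sketch does not escape regular-expression metacharacters, so a search term containing characters such as . or ( would need extra handling before being used this way.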

Source: https://habr.com/ru/post/1779114/

