MySQL Full-Text Search Mystery

We have a simple search on our site that uses MySQL full-text search, and for some reason it does not seem to return the correct results. I don't know whether the problem lies with Amazon RDS (where our database server is hosted) or with the query we are running.

Here is the structure of the database table:

CREATE TABLE `items` (
  `object_id` int(9) unsigned NOT NULL DEFAULT '0',
  `slug` varchar(100) DEFAULT NULL,
  `name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`object_id`),
  FULLTEXT KEY `name` (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

And here is a simple full-text search query against this table, with the results it returns:

select object_id, slug, name from items where MATCH (name) AGAINST ('+ski*' IN BOOLEAN MODE) order by name;
+-----------+-----------------------------------+------------------+
| object_id | slug                              | name             |
+-----------+-----------------------------------+------------------+
|  10146041 | us/new-hampshire/dartmouth-skiway | Dartmouth Skiway |
+-----------+-----------------------------------+------------------+

If I use LIKE, I get a different set of results:

select object_id, slug, name from items where name LIKE "%ski%" order by name;
+-----------+------------------------------------------+----------------------------------+
| object_id | slug                                     | name                             |
+-----------+------------------------------------------+----------------------------------+
|  10146546 | us/new-york/brantling-ski                | Brantling Ski                    |
|  10146548 | us/new-york/buffalo-ski-club             | Buffalo Ski Club                 |
|  10146041 | us/new-hampshire/dartmouth-skiway        | Dartmouth Skiway                 |
|  10146352 | us/montana/discover-ski                  | Discover Ski                     |
|  10144882 | us/california/donner-ski-ranch           | Donner Ski Ranch                 |
|  10146970 | us/new-york/hickory-ski-center           | Hickory Ski Center               |
|  10146973 | us/new-york/holimont-ski-area            | Holimont Ski Area                |
|  10146283 | us/minnesota/hyland-ski                  | Hyland Ski                       |
|  10145911 | us/nevada/las-vegas-ski-snowboard-resort | Las Vegas Ski & Snowboard Resort |
|  10146977 | us/new-york/maple-ski-ridge              | Maple Ski Ridge                  |
|  10146774 | us/oregon/mount-hood-ski-bowl            | Mt. Hood Ski Bowl                |
|  10145949 | us/new-mexico/sipapu-ski                 | Sipapu Ski                       |
|  10145952 | us/new-mexico/ski-apache                 | Ski Apache                       |
|  10146584 | us/north-carolina/ski-beech              | Ski Beech                        |
|  10147973 | canada/quebec/ski-bromont                | Ski Bromont                      |
|  10146106 | us/michigan/ski-brule                    | Ski Brule                        |
|  10145597 | us/massachusetts/ski-butternut           | Ski Butternut                    |
|  10145117 | us/colorado/ski-cooper                   | Ski Cooper                       |
|  10146917 | us/pennsylvania/ski-denton               | Ski Denton                       |
|  10145954 | us/new-mexico/ski-santa-fe               | Ski Santa Fe                     |
|  10146918 | us/pennsylvania/ski-sawmill              | Ski Sawmill                      |
|  10145299 | us/illinois/ski-snowstar                 | Ski Snowstar                     |
|  10145138 | us/connecticut/ski-sundown               | Ski Sundown                      |
|  10145598 | us/massachusetts/ski-ward                | Ski Ward                         |
+-----------+------------------------------------------+----------------------------------+

I am at a complete loss as to why the full-text query does not work. I hope some MySQL experts can point out the error in our query.

Thanks in advance for your help!

2 answers

From the MySQL docs:

  • + A leading plus sign indicates that this word must be present in each row that is returned.

  • * The asterisk serves as the truncation (or wildcard) operator. Unlike the other operators, it should be appended to the word to be affected. Words match if they begin with the word preceding the * operator.

    If a word is specified with the truncation operator, it is not stripped from a boolean query, even if it is too short (as determined from the ft_min_word_len setting) or a stopword. This occurs because the word is not seen as too short or a stopword, but as a prefix that must be present in the document in the form of a word that begins with the prefix.

In the context:

MATCH (...) AGAINST (...)

MATCH (name) AGAINST ('+ski*' IN BOOLEAN MODE) means that you are looking for rows in which the name column contains at least one indexed word that starts with the prefix ski.

From the set that you posted, Dartmouth Skiway is the only name that meets this requirement: Skiway starts with ski and, at six characters, is long enough to be stored in the full-text index.

The other names contain ski only as the standalone three-character word Ski. With the default ft_min_word_len of 4, words that short are never added to the index, so those rows have no indexed word for the prefix ski to match. That is why the boolean search returns a single row.

As suggested by ajreal, try decreasing the ft_min_word_len setting in my.cnf. Your search is probably not producing the expected results because of the default value. Try reducing it to 3.
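Before changing anything, it may help to confirm which value the server is actually using. A minimal check, assuming you can run statements against the server:

```sql
-- Show the minimum word length used by MyISAM full-text indexes.
-- On an unmodified server this is expected to report 4.
SHOW VARIABLES LIKE 'ft_min_word_len';
```

If this reports 4, the three-character word Ski is too short to have been indexed.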

WHERE column LIKE '%text%'

WHERE name LIKE "%ski%" searches for rows whose name column contains the substring ski, regardless of where it occurs. It does not use the full-text index, so the minimum word length does not apply.
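To compare the two predicates side by side without touching the real table, a small reproduction along these lines may be useful. This is only a sketch: the table name items_demo is hypothetical, and the three rows are copied from the result set in the question.

```sql
-- Hypothetical scratch table; items_demo is not part of the original schema.
DROP TABLE IF EXISTS items_demo;
CREATE TABLE items_demo (
  object_id INT UNSIGNED NOT NULL,
  name VARCHAR(100) DEFAULT NULL,
  PRIMARY KEY (object_id),
  FULLTEXT KEY name (name)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO items_demo (object_id, name) VALUES
  (10146041, 'Dartmouth Skiway'),
  (10145952, 'Ski Apache'),
  (10146548, 'Buffalo Ski Club');

-- Full-text prefix search: can only match words that are in the index.
SELECT object_id, name
FROM items_demo
WHERE MATCH (name) AGAINST ('+ski*' IN BOOLEAN MODE)
ORDER BY name;

-- Substring search: scans the column text itself, ignoring the index.
SELECT object_id, name
FROM items_demo
WHERE name LIKE '%ski%'
ORDER BY name;
```

With the server default for ft_min_word_len, the first query would be expected to return only Dartmouth Skiway, as in the question, while the LIKE query returns all three rows.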


The minimum and maximum lengths of words to be indexed are defined by the ft_min_word_len and ft_max_word_len system variables. (See Section 5.1.4, "Server System Variables.") The default minimum value is four characters; the default maximum is version dependent. If you change either value, you must rebuild your FULLTEXT indexes. For example, if you want three-character words to be searchable, you can set the ft_min_word_len variable by putting the following lines in an option file:

Source: http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html

Configuration:

[mysqld]
ft_min_word_len = 3
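The variable is only read at server startup, so the server has to be restarted after the change, and the existing index still reflects the old setting until it is rebuilt. A sketch of the rebuild for the items table from the question, using the REPAIR TABLE form that the same manual page describes:

```sql
-- Rebuild the FULLTEXT index so that it picks up the new ft_min_word_len.
REPAIR TABLE items QUICK;

-- Re-run the original search; three-character words such as Ski
-- should now be in the index.
SELECT object_id, slug, name
FROM items
WHERE MATCH (name) AGAINST ('+ski*' IN BOOLEAN MODE)
ORDER BY name;
```

On Amazon RDS the option file is not directly editable, so the equivalent change would presumably be made through a DB parameter group; that part is an assumption here and is not covered in the thread.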

Source: https://habr.com/ru/post/1337623/