MySQL boolean mode full-text search with wildcards and literals

I am new to MySQL full-text search, and today I ran into this problem:

In the table "company" there is an entry with "e-magazine AG" in the column "name". I have a full-text index on the name column.

When I execute this query, the record is not found:

SELECT id, name FROM company WHERE MATCH(name) AGAINST('+"e-magazi"*' IN BOOLEAN MODE); 

I need to work with quotes because of dashes and use a wildcard because I implement the "search as you type" functionality.

When I search for the entire term "e-magazine AG", the entry is found.

Any ideas what I'm doing wrong here? I read about adding a dash to the word character list (configuration update required), but I'm looking for a way to do this programmatically.

2 answers

This expression

 MATCH(name) AGAINST('+"e-magazi"*' IN BOOLEAN MODE); 

will search for "e" AND NOT "magazi"; that is, the - inside "e-magazi" is interpreted as NOT, even though it is inside the quotation marks.
For this reason, it will not work as expected.
The solution is to add a HAVING clause with LIKE.

I know that HAVING is slow, but it is applied only to the rows returned by the full-text match, so not too many rows should be involved.

I suggest something like:

 SELECT id, name FROM company WHERE MATCH(name) AGAINST('magazi*' IN BOOLEAN MODE) HAVING name LIKE '%e-magazi%'; 
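
Since the question asks for a programmatic way, here is a minimal Python sketch of how the two search strings for a query like the one above could be derived from the typed text. The helper name build_search is hypothetical, and using the last word fragment as the prefix is my assumption, not part of the answer. The %s placeholders assume a MySQL driver with the "format" parameter style, such as PyMySQL.

```python
import re

def build_search(term):
    """Return (boolean_query, like_pattern) for typed text, or None if it has no words."""
    # Split the input into word fragments: 'e-magazi' -> ['e', 'magazi'].
    words = re.findall(r"\w+", term)
    if not words:
        return None
    # Assumption: the last fragment is the one still being typed, so it gets the wildcard.
    boolean_query = words[-1] + "*"
    # Escape LIKE metacharacters so the typed text is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return boolean_query, "%" + escaped + "%"

# Both values are passed as parameters, never concatenated into the SQL text.
SQL = ("SELECT id, name FROM company "
       "WHERE MATCH(name) AGAINST(%s IN BOOLEAN MODE) "
       "HAVING name LIKE %s")

print(build_search("e-magazi"))  # ('magazi*', '%e-magazi%')
```

With a driver cursor this would be used roughly as cursor.execute(SQL, build_search(typed_text)), after checking that the helper did not return None.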

MySQL full-text search treats the word e-magazine in a text as a phrase and not as a single word. Because of this, it yields the two words e and magazine. And when it builds the search index, it does not add e to the index because of ft_min_word_len (4 characters by default; InnoDB tables use innodb_ft_min_token_size instead, 3 by default).

The same length limit is applied to the search query. That is why a search for e-magazine returns exactly the same results as a search for a-magazine, since the single letter and the - are completely ignored.

But now you want to find the exact phrase e-magazine. For that you use quotation marks, and this is the correct way to find phrases, but MySQL does not support operators on phrases, only on words:
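
To make the effect of the length limit concrete, here is a rough Python model of the behaviour described above. It is only an illustration, not MySQL's actual parser: it ignores stopwords and other parser details, and the function name index_tokens is made up for this sketch.

```python
import re

def index_tokens(text, min_len=4):
    """Rough model: split on non-word characters, drop words shorter than min_len."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if len(w) >= min_len]

print(index_tokens("e-magazine AG"))  # ['magazine']
print(index_tokens("a-magazine"))     # ['magazine']
```

Both inputs reduce to the same single word, which is why the two searches cannot be told apart by the index.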
https://dev.mysql.com/doc/refman/5.7/en/fulltext-boolean.html

"With this modifier, certain characters have special meaning at the beginning or end of words in the search string."

Some people would suggest using the following query:

 SELECT id, name FROM company WHERE MATCH(name) AGAINST('e-magazi*' IN BOOLEAN MODE) HAVING name LIKE 'e-magazi%'; 

As I said, MySQL ignores e- and looks for the wildcard magazi*. After receiving these results, it uses HAVING to additionally filter them for e-magazi*, including the e-. This way you will find the phrase e-magazine AG. Of course, HAVING is only required if the search phrase contains a wildcard operator, and you should never add quotation marks yourself. That operator is for your users, not for you!

Note: unless you surround the search phrase with %, it will only find fields that start with that word. And you do not want to surround it, because then it would also find bee-magazine. Therefore, you may need the additional conditions OR name LIKE '% e-magazi%' OR name LIKE '%\\ne-magazi%' in the HAVING clause to make it suitable for use within texts.
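
The word-boundary patterns from the note can be generated instead of written by hand. This is a sketch under my own assumptions: the helper names are hypothetical, only a space and a line break count as word boundaries, and the %s placeholders again assume a driver with the "format" parameter style.

```python
def boundary_patterns(term):
    """LIKE patterns that match term at the start of a field or after whitespace."""
    # Escape LIKE metacharacters so the typed text is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return [
        escaped + "%",          # field starts with the term
        "% " + escaped + "%",   # term follows a space
        "%\n" + escaped + "%",  # term follows a line break
    ]

def having_clause(count):
    """Build 'HAVING name LIKE %s OR ...' with one placeholder per pattern."""
    return "HAVING " + " OR ".join(["name LIKE %s"] * count)

patterns = boundary_patterns("e-magazi")
print(having_clause(len(patterns)))
```

None of these patterns matches bee-magazine, because the character before the term must be a space, a line break, or the start of the field.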

Trick

But in the end, I prefer a trick that makes HAVING unnecessary:

  • When you add texts to your database table, also store them in a separate full-text indexed column and replace words like up-to-date with up-to-date uptodate.
  • If the user searches for up-to-date, replace it in the query with uptodate.

This way you can still find specific in user-specific, but also up-to-date as a whole (and not only date).
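
The indexing side of this trick could look like the following Python sketch. The function name text_for_index and the regular expression are my assumptions; the answer does not say how the replacement is implemented.

```python
import re

# A hyphenated word: word characters joined by one or more single hyphens.
HYPHENATED = re.compile(r"\w+(?:-\w+)+")

def text_for_index(text):
    """Keep each hyphenated word and append a hyphen-free copy right after it."""
    return HYPHENATED.sub(
        lambda m: m.group(0) + " " + m.group(0).replace("-", ""), text)

print(text_for_index("an up-to-date list"))  # an up-to-date uptodate list
```

The result goes into the extra full-text column, while the original column keeps the unchanged text for display.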

Bonus

If a user searches for -well-known huge ports, MySQL treats this as: must not include *well*, may include *known*, *huge* and *ports*. Of course, you could solve this with another variant of the additional filter, but with the trick above you remove the hyphen, so the search query simply looks like this:

 SELECT id FROM texts WHERE MATCH(text) AGAINST('-wellknown huge ports' IN BOOLEAN MODE); 
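
A possible way to do that rewrite on the user's input, again as a hedged Python sketch: rewrite_query is a hypothetical helper, and it assumes that a hyphen between two word characters is part of a word, while a leading - or + is a boolean operator and must be kept.

```python
import re

# A hyphenated word: word characters joined by one or more single hyphens.
HYPHENATED = re.compile(r"\w+(?:-\w+)+")

def rewrite_query(user_query):
    """Remove hyphens inside words but keep leading boolean operators."""
    return HYPHENATED.sub(lambda m: m.group(0).replace("-", ""), user_query)

# The rewritten text is passed as a parameter (placeholder style depends on the driver).
SQL = "SELECT id FROM texts WHERE MATCH(text) AGAINST(%s IN BOOLEAN MODE)"

print(rewrite_query("-well-known huge ports"))  # -wellknown huge ports
```

The leading - survives because it is not preceded by a word character, so only the inner hyphen is removed.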

Source: https://habr.com/ru/post/895407/

