MySQL LIKE '%string%' is not forgiving enough. Is there anything else I can use?

I have a client who wants a company-name search that tolerates several input formats. For example, take a company stored in the database as "AJR Kelly Ltd". If the user searches for "AJR Kelly", it is found using:

<cfif pctermsCount gt 0> AND (LOWER(p.name) LIKE '%#pcTerms#%') </cfif> 

If they search for "Kelly", the company is found, but if they search for a broken-up version of the string such as "A J Kelly" or "AJ Kelly", it is not found.

Is there anything I can do to make it a little more forgiving?

Thanks.

+6
4 answers

If you use MyISAM, you can use full-text indexing. See this tutorial

If you use a different storage engine, you can use a third-party full-text engine such as Sphinx, which can act as a storage engine for MySQL or as a separate server that you query.

With MySQL full-text indexing, a search for "A J Kelly" would match "AJ Kelly" (not to confuse the issue, but A, J and AJ would be ignored because they are too short by default, so the match would be on Kelly). In general, full-text search is much more forgiving than LIKE '%string%' (and usually faster), because it allows partial matches, which can then be ranked by relevance.
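To see why only Kelly ends up being matched, here is a rough Python sketch of the short-word filtering. It assumes the MyISAM default minimum word length of 4 (the `ft_min_word_len` setting). The function name `indexable_terms` is made up for this illustration, and real full-text search also applies a stopword list, which is not modelled here.

```python
import re

# Assumed default for MyISAM full-text indexes (ft_min_word_len = 4).
MIN_WORD_LEN = 4

def indexable_terms(query, min_len=MIN_WORD_LEN):
    """Return the lowercase words of `query` that are long enough to be indexed.

    Hypothetical helper: it only mimics the length filter, not stopwords
    or relevance ranking.
    """
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if len(w) >= min_len]

print(indexable_terms("A J Kelly"))
print(indexable_terms("AJ Kelly"))
```

Both calls leave only `kelly`, so both searches effectively become a search for Kelly.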

You can also use SOUNDEX to make searches more forgiving: index the phonetic equivalents of the words, apply SOUNDEX to your search terms, and then use the results to search that index. With Soundex, mary, marie and marry will all match, for example.
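As an illustration of the phonetic matching described above, here is a minimal Python sketch of the classic four-character Soundex code. It is not MySQL's implementation: the built-in SOUNDEX() function can return longer strings and may differ in edge cases, so treat the helper `soundex` below as an assumption-laden approximation.

```python
# Map each consonant to its Soundex digit; vowels, h, w and y have no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Return the classic 4-character Soundex code, or '' if no letters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        # h and w do not separate letters with the same code; vowels do.
        if ch not in "hw":
            prev = code
    return (result + "000")[:4]

for name in ("mary", "marie", "marry"):
    print(name, soundex(name))
```

All three names produce the same code, M600, which is why a lookup on the Soundex value finds them together.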

+8

If you are indeed using ColdFusion, you have access to CF's full-text indexing using either Verity or Solr/Lucene. Either of these should give you a good "fuzzy match" capability for strings.

Using MyISAM tables just to get full-text indexing is a bitter pill to swallow: you give up a lot of peace of mind, along with things like foreign key constraints.

+4

You can create a new column holding a search version of the name with the spaces removed, and then add a FULLTEXT index on that column (this will only work with MyISAM). You can also look at Lucene/Solr. Solr provides a number of tokenizers that work very well in a situation like this. The learning curve is pretty steep, but it is worth it in the long run.

+2

Tricky. I think a simple approach is to strip the spaces from the database value when searching, so that AJRKelly is compared instead of AJR Kelly. Then use spaces as the separator between individual search terms. That way, A J Kelly searches for A, J and Kelly separately, and AJ Kelly searches for AJ and Kelly separately. Both will match the space-stripped database value AJRKelly.
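A rough Python sketch of that idea, just to show the matching logic. The function names `normalize`, `split_terms` and `matches` are made up for this example, and stripping punctuation as well as spaces is an extra assumption so that input like "A. J. Kelly" also works.

```python
import re

def normalize(name):
    """Lowercase the stored name and drop spaces and punctuation."""
    return re.sub(r"[^a-z0-9]", "", name.lower())

def split_terms(query):
    """Split the user's input into lowercase terms on spaces and punctuation."""
    return [t for t in re.split(r"[^a-z0-9]+", query.lower()) if t]

def matches(stored_name, query):
    """True if every search term occurs in the space-stripped stored name."""
    haystack = normalize(stored_name)
    terms = split_terms(query)
    return bool(terms) and all(term in haystack for term in terms)

for q in ("AJR Kelly", "A J Kelly", "A. J. Kelly", "AJ Kelly", "Smith"):
    print(q, matches("AJR Kelly Ltd", q))
```

In the query itself this would probably mean one condition per term, each comparing the term against something like REPLACE(LOWER(p.name), ' ', ''). Keep in mind that single-letter terms such as A and J match very loosely, so this trades precision for forgiveness.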

+1

Source: https://habr.com/ru/post/900344/

