Thinking Sphinx fuzzy search?

I am implementing Sphinx search in my Rails application.
I want fuzzy search that tolerates spelling errors. For example, if the user enters the query "charactaristics", it should search for "characteristics".

How do I implement this?

+6
3 answers

Sphinx itself does not account for spelling errors: it does not care whether words are spelled correctly, it simply indexes them and matches against them.

There are two options: either use thinking-sphinx-raspell to catch users' spelling errors when they search and offer them the choice to search again with an improved query (as Google does); or use the soundex or metaphone morphologies, so that words are indexed in a way that takes into account how they sound. Search the linked page for the relevant section, and also read the Sphinx documentation on this subject.
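To illustrate why a sound-based morphology can absorb some misspellings, here is a minimal pure-Ruby sketch of the classic Soundex code. It is not Sphinx's implementation, which may differ in details; the `soundex` method and the `SOUNDEX_CODES` table are hypothetical names used only for this illustration.

```ruby
# Simplified classic (American) Soundex: first letter plus up to three digits.
SOUNDEX_CODES = {
  'b' => '1', 'f' => '1', 'p' => '1', 'v' => '1',
  'c' => '2', 'g' => '2', 'j' => '2', 'k' => '2',
  'q' => '2', 's' => '2', 'x' => '2', 'z' => '2',
  'd' => '3', 't' => '3',
  'l' => '4',
  'm' => '5', 'n' => '5',
  'r' => '6'
}.freeze

def soundex(word)
  letters = word.downcase.gsub(/[^a-z]/, '')
  return '' if letters.empty?

  result = letters[0].upcase
  prev = SOUNDEX_CODES[letters[0]]

  letters[1..-1].each_char do |ch|
    code = SOUNDEX_CODES[ch]
    if code
      # Skip a consonant that repeats the previous code.
      result << code unless code == prev
      prev = code
    elsif ch != 'h' && ch != 'w'
      # Vowels (and y) reset the previous code; h and w do not.
      prev = nil
    end
  end

  # Truncate or pad with zeros to exactly four characters.
  result[0, 4].ljust(4, '0')
end

# The misspelling from the question and the correct word share a code,
# because they differ only in a vowel.
puts soundex('charactaristics') == soundex('characteristics')
```

The trade-off is that unrelated words which merely sound alike also collapse to the same code, so matching becomes looser across the whole index.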

I have no idea how reliable either option will be. Personally, I would choose option 1.
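The "did you mean" idea behind option 1 can be sketched without any spell-checking library. The example below is only an illustration and does not use the thinking-sphinx-raspell API: it swaps in plain Levenshtein edit distance against a small word list. The `levenshtein` and `suggest` methods, the `max_distance` parameter, and the dictionary contents are all assumptions made for the sketch.

```ruby
# Edit distance between two strings (insertions, deletions, substitutions).
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Return the closest dictionary word, or nil when the query is already
# a known word or nothing is close enough.
def suggest(query, dictionary, max_distance: 2)
  return nil if dictionary.include?(query)

  best = dictionary.min_by { |word| levenshtein(query, word) }
  return nil if best.nil?

  levenshtein(query, best) <= max_distance ? best : nil
end

# Hypothetical word list; in practice it could come from the indexed content.
dictionary = %w[characteristics character search]
suggestion = suggest('charactaristics', dictionary)
puts "Did you mean: #{suggestion}?" if suggestion
```

In an application, the suggestion would be shown next to the results so the user can rerun the search with the corrected query.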

+6

By default, Sphinx does not support wildcard searches using an asterisk. You can enable them, though:

    development:
      enable_star: true
    # ... repeat for other environments

See the Wildcard/Star Syntax section of http://pat.imtqy.com/thinking-sphinx/advanced_config.html.

+3

Yes, Sphinx generally uses the extended matching modes.

The following matching modes are available:

 SPH_MATCH_ALL, matches all query words (default mode);
 SPH_MATCH_ANY, matches any of the query words;
 SPH_MATCH_PHRASE, matches query as a phrase, requiring perfect match;
 SPH_MATCH_BOOLEAN, matches query as a boolean expression (see Section 5.2, "Boolean query syntax");
 SPH_MATCH_EXTENDED, matches query as an expression in Sphinx internal query language (see Section 5.3, "Extended query syntax");
 SPH_MATCH_EXTENDED2, an alias for SPH_MATCH_EXTENDED;
 SPH_MATCH_FULLSCAN, matches query, forcibly using the "full scan" mode as below. NB, any query terms will be ignored, such that filters, filter-ranges and grouping will still be applied, but no text-matching.

SPH_MATCH_EXTENDED2 was used during the 0.9.8 and 0.9.9 development cycle, when the internal matching engine was being rewritten (for the sake of additional functionality and better performance). By the 0.9.9 release, the older engine was removed, and SPH_MATCH_EXTENDED and SPH_MATCH_EXTENDED2 are now just aliases.
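To make the difference between the simpler modes concrete, here is a toy Ruby model of how "all", "any" and "phrase" matching treat the same query. It only imitates the descriptions above on plain strings and does not call Sphinx; the `matches?` method and its mode symbols are hypothetical names for this sketch.

```ruby
# Toy model of three matching modes, applied to a single document string.
def matches?(mode, query, document)
  query_words = query.downcase.scan(/\w+/)
  doc_words = document.downcase.scan(/\w+/)
  return false if query_words.empty?

  case mode
  when :all
    # Every query word must occur somewhere in the document.
    query_words.all? { |word| doc_words.include?(word) }
  when :any
    # At least one query word must occur.
    query_words.any? { |word| doc_words.include?(word) }
  when :phrase
    # The query words must occur consecutively, in the same order.
    doc_words.each_cons(query_words.length).any? { |window| window == query_words }
  else
    raise ArgumentError, "unknown mode: #{mode}"
  end
end

document = 'Sphinx is a full text search engine'
puts matches?(:all, 'search sphinx', document)
puts matches?(:phrase, 'search sphinx', document)
puts matches?(:any, 'search lucene', document)
```

None of these modes tolerates a misspelled word by itself; each only changes how correctly spelled query words are combined.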

enable_star

Enables star syntax (or wildcard syntax) when searching through prefix/infix indexes. Optional, default is 0 (do not use wildcard syntax), for compatibility with 0.9.7. Known values are 0 and 1.

For example, assume that the index was built with infixes and that enable_star is 1. Searching should work as follows:

 "abcdef" query will match only those documents that contain the exact "abcdef" word in them.
 "abc*" query will match those documents that contain any words starting with "abc" (including the documents which contain the exact "abc" word only);
 "*cde*" query will match those documents that contain any words which have "cde" characters in any part of the word (including the documents which contain the exact "cde" word only).
 "*def" query will match those documents that contain any words ending with "def" (including the documents that contain the exact "def" word only).

Example:

enable_star = 1
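The star rules described above can be modelled in a few lines of Ruby by turning a star pattern into an anchored regular expression and testing it against each word of a document. This is only a sketch of the described behaviour, not how Sphinx evaluates queries, and it ignores index settings such as minimum prefix or infix lengths. The `star_pattern_to_regexp` and `matches_document?` methods are hypothetical names.

```ruby
# Convert a star pattern such as "abc*" or "*cde*" into a whole-word regexp.
def star_pattern_to_regexp(pattern)
  # split with -1 keeps trailing empty parts, so a trailing star is preserved.
  body = pattern.downcase.split('*', -1).map { |part| Regexp.escape(part) }.join('.*')
  /\A#{body}\z/
end

# A document matches when at least one of its words matches the pattern.
def matches_document?(pattern, document)
  regexp = star_pattern_to_regexp(pattern)
  document.downcase.scan(/\w+/).any? { |word| regexp.match?(word) }
end

puts matches_document?('abc*', 'a document with abcdef in it')
puts matches_document?('*cde*', 'a document with abcdef in it')
puts matches_document?('*def', 'a document with abcdefg in it')
```

Wildcards help with partial words, but they still require the typed fragment to be spelled correctly, so they complement rather than replace spelling suggestions.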

+2

Source: https://habr.com/ru/post/888519/

