Optimizing Soundex Query to Find Similar Names

My application will offer a list of suggestions for English names that “sound like” a given typed name.

The query needs to be optimized so that results come back as quickly as possible. Which of the following options will return results fastest? (Or your own suggestion, if you have one.)

A. Create a Soundex hash and store it in the "Names" table, then do something like the following. (This avoids generating a Soundex hash for every row in my db on every request, right?)

select name from names where NameSoundex = Soundex('Ann')

B. Use the Difference function (this has to generate a Soundex for every name in the table, doesn't it?)

select name from names where Difference(name, 'Ann') >= 3

C. Simple comparison

select name from names where Soundex(name) = Soundex('Ann')

  • Option A seems to me the fastest to return results, because it generates a Soundex for only one string and then compares it against the indexed column "NameSoundex".

  • Option B should return more results than option A, because the name does not have to be an exact Soundex match, but it may be slower.

  • Assuming my table can contain millions of rows, which option will give the best results?
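
Option A as described could be set up roughly as follows. This is a minimal T-SQL sketch, assuming a Names table with a Name column and no existing NameSoundex column. Using a persisted computed column to keep the hash in sync is an assumption, not something stated in the question, and the index name IX_Names_NameSoundex is hypothetical.

```sql
-- Assumption: Names(Name) exists and has no NameSoundex column yet.
-- A persisted computed column stores the Soundex code once per row
-- and keeps it up to date when Name changes.
ALTER TABLE Names
    ADD NameSoundex AS SOUNDEX(Name) PERSISTED;

-- Hypothetical index name; allows an index seek on the stored code.
CREATE INDEX IX_Names_NameSoundex
    ON Names (NameSoundex);

-- SOUNDEX is evaluated once for the literal and the result is
-- matched against the indexed column.
SELECT Name
FROM Names
WHERE NameSoundex = SOUNDEX('Ann');
```

By contrast, options B and C wrap the Name column in a function call inside the WHERE clause, so an ordinary index on Name cannot be used for a seek and the function is evaluated for every row.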

1 answer

You can precompute the DIFFERENCE() of all your names and store the results in a table, for example:

Differences
Name1
Name2
Difference


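INSERT INTO Differences
        (Name1, Name2, Difference)
    SELECT
        n1.Name, n2.Name, DIFFERENCE(n1.Name, n2.Name)
        FROM Names           n1
            CROSS JOIN Names n2
        -- DIFFERENCE returns 0 to 4, where 4 is the closest match,
        -- so keep only pairs at or above the chosen threshold
        WHERE DIFFERENCE(n1.Name, n2.Name) >= ??? --to put a cap on what to store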

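At query time, read the matches from the Differences table, or filter directly with WHERE DIFFERENCE(@givenName,Names.Name)>=@UserSelectLevel (DIFFERENCE returns 0 to 4, and 4 is the closest match).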
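
A lookup against the precomputed table might look like the sketch below. It assumes the Differences table has already been populated and that the given name is present in Names. The variable types, the sample values, and the index name IX_Differences_Name1 are assumptions made for illustration. Because DIFFERENCE returns 0 to 4 with 4 as the closest match, the filter keeps rows at or above the chosen level.

```sql
-- Hypothetical index to support the lookup by Name1 and threshold.
CREATE INDEX IX_Differences_Name1
    ON Differences (Name1, Difference);

-- Assumed types and sample values; adjust to the real Name column.
DECLARE @givenName varchar(50) = 'Ann';
DECLARE @UserSelectLevel int = 3;

-- Read precomputed matches instead of calling DIFFERENCE per row.
SELECT d.Name2
FROM Differences AS d
WHERE d.Name1 = @givenName
  AND d.Difference >= @UserSelectLevel
ORDER BY d.Difference DESC;
```

Note that the CROSS JOIN used to fill the table compares every name with every other name, so with millions of rows the number of candidate pairs grows quadratically and the storage cap matters a great deal.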


Source: https://habr.com/ru/post/1741216/

