String Pattern Matching for Limited, Ltd, Incorporated, Inc, etc.

We are doing a lot of work to reconcile about 1,000 duplicate manufacturer names and 1,000,000 duplicate part numbers. One issue that has come up is how to “match” things like “Limited” vs. “Ltd.” vs. “Ltd”.

The goal is for the application to normalize these equivalent elements to a standard format. So:

ACME Ltd.
ACME Limited
ACME Ltd

should all be reconciled to ACME Ltd.

It will also be used to prevent the introduction of additional duplicates in the future.

Any suggestions for doing this pattern matching in SQL Server? Are there any known algorithms for looking up items with mapped equivalents?

Thanks!

Eric


, , , ?

Ltd   Limited 
Ltd   Ltd.
St    Street
St    Str.

, , . , .
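
The variant/canonical pairs listed above suggest a plain lookup table. Here is a minimal sketch of that idea in Python, assuming only the trailing token of a manufacturer name needs normalizing. The `SUFFIX_MAP` table and the `normalize_name` function are hypothetical names for illustration. The canonical form "Ltd." is taken from the question, while "Inc." is an assumption.

```python
# Hypothetical lookup table: variant (lowercase, trailing period removed)
# mapped to the canonical suffix. "Ltd." follows the question's example;
# "Inc." is an assumed canonical form.
SUFFIX_MAP = {
    "limited": "Ltd.",
    "ltd": "Ltd.",
    "incorporated": "Inc.",
    "inc": "Inc.",
}


def normalize_name(name: str) -> str:
    """Replace a known trailing company suffix with its canonical form."""
    tokens = name.split()  # also collapses runs of whitespace
    if not tokens:
        return ""
    # Compare case-insensitively and ignore a trailing period.
    key = tokens[-1].rstrip(".").lower()
    if key in SUFFIX_MAP:
        tokens[-1] = SUFFIX_MAP[key]
    return " ".join(tokens)


if __name__ == "__main__":
    for raw in ("ACME Ltd.", "ACME Limited", "ACME Ltd"):
        print(raw, "->", normalize_name(raw))
```

The same table could live in a database table and be joined against when names are inserted, which would also help keep new duplicates out. That part is not shown here.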


SQL Server, :

SQL , (a ).

:

 <expansion>
         <sub>Limited</sub>
         <sub>Ltd</sub>
         <sub>Ltd.</sub>
 </expansion>

, , . , ...

SQL Server LIKE. , , , .

LIKE , CLR UDF, . ...


Source: https://habr.com/ru/post/1786541/

