Search for the Korean alphabet index (any Unicode char) in Korean Word (any Unicode Word) in SQL Server

I have a requirement to search for people by name. Here, the names of people can be in English, Korean or Chinese. To do this, I used the Like condition to search based on Name , as shown below:

 select * from [MyTable] where Name like N'%t%' 

The above operator provides all users that contain the letter t . But this does not work with Korean or Chinese. For example, if I search with the Korean letter γ…ˆ , then it must provide all the names containing this letter, like **μ •μˆ˜μ—°, μž¬ν›ˆμ•„μ΄νŒŸ, μ •μ›ν˜ ν…ŒμŠ€νŠΈ 7** . I tried the following methods, but it gave zero results

 select * from [MyTable] where Name like N'%γ…ˆ%' - No Results select PATINDEX(N'%γ…ˆ%',N'μ •μˆ˜μ—°(Mohan)') - giving value as ZERO select Charindex(N'γ…ˆ',N'μ •μˆ˜μ—°') - giving value as ZERO 

Is there a way to find the alphabets of other languages ​​on a SQL server?

I know how to find the existence of an alphabet in another language in C # words using coding methods, but not on a SQL server. Please help me in this regard.

Thanks in advance.

EDIT for C # code

 public static string DecomposeSyllabels(string unicodeString) { try { //Consonant consonant only used string[] JLT = { "γ„±", "γ„²", "γ„΄", "γ„·", "γ„Έ", "γ„Ή", "ㅁ", "γ…‚", "γ…ƒ", "γ……", "γ…†", "γ…‡", "γ…ˆ", "γ…‰", "γ…Š", "γ…‹", "γ…Œ", "ㅍ", "γ…Ž" }; // Only used a collection of neutral string[] JVT = { "ㅏ", "ㅐ", "γ…‘", "γ…’", "γ…“", "γ…”", "γ…•", "γ…–", "γ…—", "γ…˜", "γ…™", "γ…š", "γ…›", "γ…œ", "ㅝ", "γ…ž", "γ…Ÿ", "γ… ", "γ…‘", "γ…’", "γ…£" }; // Initial and coda consonants used in string[] JTT = { "", "γ„±", "γ„²", "γ„³", "γ„΄", "γ„΅", "γ„Ά", "γ„·", "γ„Ή", "γ„Ί", "γ„»", "γ„Ό", "γ„½", "γ„Ύ", "γ„Ώ", "γ…€", "ㅁ", "γ…‚", "γ…„", "γ……", "γ…†", "γ…‡", "γ…ˆ", "γ…Š", "γ…‹", "γ…Œ", "ㅍ", "γ…Ž" }; double SBase = 0xAC00; long SCount = 11172; int TCount = 28; int NCount = 588; string syllables = string.Empty; foreach (char c in unicodeString) { double SIndex = (int)c - SBase; if (0 > SIndex || SIndex >= SCount) { syllables = syllables + c; continue; } int LIndex = (int)Math.Floor(SIndex / NCount); int VIndex = (int)(Math.Floor((SIndex % NCount) / TCount)); int TIndex = (int)(SIndex % TCount); syllables = syllables + (JLT[LIndex] + JVT[VIndex] + JTT[TIndex]); } return syllables; } catch { return unicodeString; } } 
+4
source share
1 answer

You will have to decompose Korean syllables and save them in a separate column in SQL-db (for example, γ…ˆγ…“ γ…‡γ…… γ…œγ…‡ γ…•γ„΄ for μ •μˆ˜μ—°). I suggest you write a small user application that parses your db, decomposes all Korean syllables, and saves the results in a separate column.

EDIT

Here is some Python code that decomposes Hangul syllables:

 #!/usr/local/bin/python # -*- coding: utf8 -*- import codecs, sys, os, math JLT="γ„±,γ„²,γ„΄,γ„·,γ„Έ,γ„Ή,ㅁ,γ…‚,γ…ƒ,γ……,γ…†,γ…‡,γ…ˆ,γ…‰,γ…Š,γ…‹,γ…Œ,ㅍ,γ…Ž".split(",") JTT=",γ„±,γ„²,γ„±γ……,γ„΄,γ„΄γ…ˆ,γ„΄γ…Ž,γ„·,γ„Ή,γ„Ήγ„±,ㄹㅁ,γ„Ήγ…‚,γ„Ήγ……,γ„Ήγ…Œ,ㄹㅍ,γ„Ήγ…Ž,ㅁ,γ…‚,γ…‚γ……,γ……,γ…†,γ…‡,γ…ˆ,γ…Š,γ…‹,γ…Œ,ㅍ,γ…Ž".split(",") JVT="ㅏ,ㅐ,γ…‘,γ…’,γ…“,γ…”,γ…•,γ…–,γ…—,γ…˜,γ…™,γ…š,γ…›,γ…œ,ㅝ,γ…ž,γ…Ÿ,γ… ,γ…‘,γ…’,γ…£".split(",") SBase=0xAC00 SCount=11172 TCount=28 NCount=588 def HangulName(a): b=a.decode('utf8') sound='' for i in b: cp=ord(i) SIndex = cp - SBase if (0 > SIndex or SIndex >= SCount): # "Not a Hangul Syllable" pass LIndex = int(math.floor(SIndex / NCount)) VIndex = int(math.floor((SIndex % NCount) / TCount)) TIndex = int(SIndex % TCount) sound=sound+(JLT[LIndex] + JVT[VIndex] + JTT[TIndex]).lower() return sound print HangulName("μ •μˆ˜μ—°") 

dda $ python test.py
γ…ˆγ…“ γ…‡γ…… γ…œγ…‡ γ…•γ„΄

+2
source

Source: https://habr.com/ru/post/1436109/


All Articles