Search for the Korean alphabet index (any Unicode char) in Korean Word (any Unicode Word) in SQL Server

Question

I need to search for people by name, and the names can be in English, Korean, or Chinese. To do this, I used the LIKE operator on the Name column, as shown below:

 select * from [MyTable] where Name like N'%t%'

The query above returns all users whose name contains the letter t, but the same approach does not work for Korean or Chinese. For example, if I search for the Korean letter ㅈ, it should return all names containing that letter, such as **정수연, 재훈아이팟, 정원혁 테스트 7**. I tried the following queries, but none of them found a match:

    select * from [MyTable] where Name like N'%ㅈ%'  -- no results
    select PATINDEX(N'%ㅈ%', N'정수연(Mohan)')  -- returns 0
    select CHARINDEX(N'ㅈ', N'정수연')  -- returns 0

Is there a way to find letters of another language's alphabet inside a word in SQL Server?

I know how to check whether a word contains a given letter of another language in C# code, but not in SQL Server. Please help me with this.

Thanks in advance.

EDIT: C# code

    public static string DecomposeSyllabels(string unicodeString)
    {
        try
        {
            // Initial consonants (choseong)
            string[] JLT = { "ㄱ", "ㄲ", "ㄴ", "ㄷ", "ㄸ", "ㄹ", "ㅁ", "ㅂ", "ㅃ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅉ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ" };
            // Medial vowels (jungseong)
            string[] JVT = { "ㅏ", "ㅐ", "ㅑ", "ㅒ", "ㅓ", "ㅔ", "ㅕ", "ㅖ", "ㅗ", "ㅘ", "ㅙ", "ㅚ", "ㅛ", "ㅜ", "ㅝ", "ㅞ", "ㅟ", "ㅠ", "ㅡ", "ㅢ", "ㅣ" };
            // Final consonants (jongseong); the empty first entry means "no final consonant"
            string[] JTT = { "", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ" };

            const int SBase = 0xAC00;  // first precomposed Hangul syllable
            const int SCount = 11172;  // number of precomposed syllables
            const int TCount = 28;     // finals per vowel, including "none"
            const int NCount = 588;    // 21 vowels * 28 finals

            string syllables = string.Empty;
            foreach (char c in unicodeString)
            {
                int SIndex = c - SBase;
                if (SIndex < 0 || SIndex >= SCount)
                {
                    // Not a Hangul syllable: copy the character unchanged
                    syllables += c;
                    continue;
                }
                int LIndex = SIndex / NCount;
                int VIndex = (SIndex % NCount) / TCount;
                int TIndex = SIndex % TCount;
                syllables += JLT[LIndex] + JVT[VIndex] + JTT[TIndex];
            }
            return syllables;
        }
        catch
        {
            // Fall back to the input if anything goes wrong (e.g. a null argument)
            return unicodeString;
        }
    }

sql-server unicode sql-server-2008-r2 cjk

Mohan, Sep 25 '12 at 9:26

1 answer

dda · Accepted Answer · 2012-10-07T03:48:05+0000

You will have to decompose the Korean syllables and store the result in a separate column in your database (for example, ㅈㅓㅇㅅㅜㅇㅕㄴ for 정수연), then run the LIKE search against that column. I suggest writing a small application that reads your database, decomposes all Korean syllables, and saves the results in that column.
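To illustrate why the decomposition is needed: each Hangul syllable such as 정 is stored as a single precomposed code point, so the letter ㅈ never appears in the string as a character of its own. Here is a minimal Python sketch of that point, using the standard Unicode Hangul index arithmetic (the variable names are only illustrative):

```python
# Each precomposed Hangul syllable is one code point in U+AC00..U+D7A3.
name = "정수연"
contains_jamo = "ㅈ" in name                  # no standalone ㅈ in the string
code_points = [hex(ord(ch)) for ch in name]  # three code points, one per syllable

# 19 initials x 21 vowels x 28 finals = 11172 syllables, laid out in that order.
S_BASE, N_COUNT, T_COUNT = 0xAC00, 588, 28
s_index = ord("정") - S_BASE
l_index, rest = divmod(s_index, N_COUNT)      # index of the initial consonant
v_index, t_index = divmod(rest, T_COUNT)      # vowel index, final consonant index

print(contains_jamo, code_points)             # False ['0xc815', '0xc218', '0xc5f0']
print(s_index, l_index, v_index, t_index)     # 7189 12 4 21
```

Index 12 in the initial-consonant table is ㅈ, index 4 in the vowel table is ㅓ, and index 21 in the final-consonant table is ㅇ, which is how 정 becomes ㅈㅓㅇ.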

EDIT

Here is some Python code that decomposes Hangul syllables:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-

    # Initial consonants
    JLT = "ㄱ,ㄲ,ㄴ,ㄷ,ㄸ,ㄹ,ㅁ,ㅂ,ㅃ,ㅅ,ㅆ,ㅇ,ㅈ,ㅉ,ㅊ,ㅋ,ㅌ,ㅍ,ㅎ".split(",")
    # Final consonants; compound finals are spelled out, the empty first entry means "none"
    JTT = ",ㄱ,ㄲ,ㄱㅅ,ㄴ,ㄴㅈ,ㄴㅎ,ㄷ,ㄹ,ㄹㄱ,ㄹㅁ,ㄹㅂ,ㄹㅅ,ㄹㅌ,ㄹㅍ,ㄹㅎ,ㅁ,ㅂ,ㅂㅅ,ㅅ,ㅆ,ㅇ,ㅈ,ㅊ,ㅋ,ㅌ,ㅍ,ㅎ".split(",")
    # Vowels
    JVT = "ㅏ,ㅐ,ㅑ,ㅒ,ㅓ,ㅔ,ㅕ,ㅖ,ㅗ,ㅘ,ㅙ,ㅚ,ㅛ,ㅜ,ㅝ,ㅞ,ㅟ,ㅠ,ㅡ,ㅢ,ㅣ".split(",")
    SBase = 0xAC00
    SCount = 11172
    TCount = 28
    NCount = 588

    def HangulName(a):
        sound = ''
        for i in a:
            SIndex = ord(i) - SBase
            if SIndex < 0 or SIndex >= SCount:
                # Not a Hangul syllable: keep the character unchanged
                sound += i
                continue
            LIndex = SIndex // NCount
            VIndex = (SIndex % NCount) // TCount
            TIndex = SIndex % TCount
            sound += JLT[LIndex] + JVT[VIndex] + JTT[TIndex]
        return sound

    print(HangulName("정수연"))

dda $ python test.py
ㅈㅓㅇㅅㅜㅇㅕㄴ
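To connect this back to the search itself: once the decomposed form is stored in its own column, an ordinary LIKE works against it. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for SQL Server so that it runs anywhere. The NameJamo column and the decompose, build_demo_db, and search helpers are hypothetical names made up for illustration, not part of the original schema.

```python
import sqlite3

# Same tables as in the decomposition code above (compound finals spelled out).
JLT = "ㄱ,ㄲ,ㄴ,ㄷ,ㄸ,ㄹ,ㅁ,ㅂ,ㅃ,ㅅ,ㅆ,ㅇ,ㅈ,ㅉ,ㅊ,ㅋ,ㅌ,ㅍ,ㅎ".split(",")
JVT = "ㅏ,ㅐ,ㅑ,ㅒ,ㅓ,ㅔ,ㅕ,ㅖ,ㅗ,ㅘ,ㅙ,ㅚ,ㅛ,ㅜ,ㅝ,ㅞ,ㅟ,ㅠ,ㅡ,ㅢ,ㅣ".split(",")
JTT = ",ㄱ,ㄲ,ㄱㅅ,ㄴ,ㄴㅈ,ㄴㅎ,ㄷ,ㄹ,ㄹㄱ,ㄹㅁ,ㄹㅂ,ㄹㅅ,ㄹㅌ,ㄹㅍ,ㄹㅎ,ㅁ,ㅂ,ㅂㅅ,ㅅ,ㅆ,ㅇ,ㅈ,ㅊ,ㅋ,ㅌ,ㅍ,ㅎ".split(",")

def decompose(text):
    """Replace every Hangul syllable with its letters; keep other characters."""
    out = []
    for ch in text:
        s = ord(ch) - 0xAC00
        if 0 <= s < 11172:
            out.append(JLT[s // 588] + JVT[(s % 588) // 28] + JTT[s % 28])
        else:
            out.append(ch)
    return "".join(out)

def build_demo_db(names):
    """Create an in-memory table with a hypothetical NameJamo shadow column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (Name TEXT, NameJamo TEXT)")
    conn.executemany(
        "INSERT INTO MyTable (Name, NameJamo) VALUES (?, ?)",
        [(n, decompose(n)) for n in names],
    )
    return conn

def search(conn, letter):
    """Return the original names whose decomposed form contains the letter."""
    rows = conn.execute(
        "SELECT Name FROM MyTable WHERE NameJamo LIKE ? ORDER BY rowid",
        ("%" + letter + "%",),
    )
    return [row[0] for row in rows]

conn = build_demo_db(["정수연", "재훈아이팟", "정원혁 테스트 7", "Mohan"])
print(search(conn, "ㅈ"))    # the three Korean names from the question
print(search(conn, "Moh"))  # non-Korean names are still searchable
```

In SQL Server the equivalent would presumably be an NVARCHAR column filled by the small application and queried with `like N'%ㅈ%'`. Note that this matches ㅈ anywhere in a syllable, including as a final consonant; with the spelled-out finals, it also matches inside a compound final such as ㄴㅈ.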
