How to check if the first five characters of one field correspond to another?

Assuming I have the following table:

AAAAAA AAAAAB CCCCCC 

How can I create a query that lets me know that AAAAA and AAAAB are similar (since they have five consecutive characters)? Ideally, I would like to write this as a query that would check if the two fields shared the five characters in a string anywhere in the string, but is that outside the scope of SQL and something that I have to write to a C # application?

Ideally, the query will add another column that displays: Similar to 'AAAAA', 'AAAAB'

+4
source share
3 answers

I suggest you not try to break 1NF by introducing a multi-valued attribute.

Noting that SUBSTRING very portable:

 WITH T AS ( SELECT * FROM ( VALUES ('AAAAAA'), ('AAAAAB'), ('CCCCCC') ) AS T (data_col) ) SELECT T1.data_col, T2.data_col AS data_col_similar_to FROM T AS T1, T AS T2 WHERE T1.data_col < T2.data_col AND SUBSTRING(T1.data_col, 1, 5) = SUBSTRING(T2.data_col, 1, 5); 

Alternativvely:

 T1.data_col LIKE SUBSTRING(T2.data_col, 1, 5) + '%'; 
+7
source

This will find all matches, as well as those that are in the middle of the word, it will not work well on a large table

 declare @t table(a varchar(20)) insert @t select 'AAAAAA' insert @t select 'AAAAAB' insert @t select 'CCCCCC' insert @t select 'ABCCCCC' insert @t select 'DDD' declare @compare smallint = 5 ;with cte as ( select a, left(a, @compare) suba, 1 h from @t union all select a, substring(a, h + 1, @compare), h+1 from cte where cte.h + @compare <= len(a) ) select ta, cte.a match from @tt -- if you don't want the null matches, remove the 'left' from this join left join cte on charindex(suba, ta) > 0 and ta <> cte.a group by ta, cte.a 

Result:

 a match -------------------- ------ AAAAAA AAAAAB AAAAAB AAAAAA ABCCCCC CCCCCC CCCCCC ABCCCCC 
+3
source

You can use left to compare the first five characters, and you can use for xml path to combine similar lines into one column.

 declare @T table ( ID int identity primary key, Col varchar(10) ) insert into @T values ('AAAAAA'), ('AAAAAB'), ('AAAAAC'), ('CCCCCC') select Col, stuff((select ','+T2.Col from @T as T2 where left(T1.Col, 5) = left(T2.Col, 5) and T1.ID <> T2.ID for xml path(''), type).value('.', 'varchar(max)'), 1, 1, '') as Similar from @T as T1 

Result:

 Col Similar ---------- ------------------------- AAAAAA AAAAAB,AAAAAC AAAAAB AAAAAA,AAAAAC AAAAAC AAAAAA,AAAAAB CCCCCC NULL 
+2
source

Source: https://habr.com/ru/post/1394544/


All Articles