Replace Unicode characters in T-SQL

Question

I would like to replace only the last character (þ) of this string:

select REPLACE('this is the news with a þ', 'þ', '__')

As a result, I get:

__is is __e news wi__ a __

EDIT: The server and database collation is Latin1_General_CI_AS.

The actual query I run is REPLACE(note, 'þ', ''), where note is an ntext column. The point is to strip out the thorn characters, because that character is used later in the process as a column delimiter. (Please do not suggest changing the delimiter; that just won't happen, considering how widely it is used!)

I tried using the N prefix, even on the SELECT statement above, but the result is still broken.

sql sql-server tsql unicode collation

Sean Mar 12 '15 at 14:46

2 answers

This might work for you:

 DECLARE @text NVARCHAR(1000) = N'this is the news with a þ';
 DECLARE @find NVARCHAR(1000) = N'þ';
 DECLARE @replace NVARCHAR(1000) = N'_';
 -- Explicit lengths: CAST(... AS VARCHAR) without a length truncates at 30 characters.
 SELECT REPLACE(CAST(@text AS VARCHAR(1000)), CAST(@find AS VARCHAR(1000)), CAST(@replace AS VARCHAR(1000)));

Dmitrij Kultasev Mar 12 '15 at 14:54

Solomon Rutzky · Accepted Answer · 2015-03-12T14:57:47+0000

The character þ (code 254 in both the Latin1 code page and Unicode, U+00FE) is known as "thorn", and in some languages it equates to th:

Technical information on the character is here: http://unicode-table.com/en/00FE/
An explanation of this character and collations is here: http://userguide.icu-project.org/collation/customization . Search the page (typically Control-F) for "Complex Tailoring Examples" and you will see the following:
The letter 'þ' (THORN) is normally treated by UCA/root collation as a separate letter that has primary-level sorting after 'z'. However, in Swedish and some other Scandinavian languages, 'þ' and 'Þ' should be treated as just a tertiary-level difference from the letters "th" and "TH" respectively.
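
One way to check this equivalence on your own server is to compare the two strings directly. The following is only a minimal sketch: it assumes a SQL Server instance where both the collation from the question (Latin1_General_CI_AS) and the binary collation used further down in this answer (Latin1_General_100_BIN2) are available, and the column aliases are purely illustrative.

```sql
-- Code point of the thorn character (U+00FE is 254 in decimal).
SELECT UNICODE(N'þ') AS thorn_code_point;

-- Compare þ with 'th' under the linguistic collation from the question
-- and under a binary collation, which compares code points only.
SELECT
    CASE WHEN N'þ' = N'th' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'different' END AS linguistic_comparison,
    CASE WHEN N'þ' = N'th' COLLATE Latin1_General_100_BIN2
         THEN 'equal' ELSE 'different' END AS binary_comparison;
```

If the linguistic comparison reports the strings as equal while the binary one does not, that would explain why REPLACE in the question also matched every "th".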

If you do not want þ to equate to th, then force a binary collation as follows:

 SELECT REPLACE(N'this is the news with a þ' COLLATE Latin1_General_100_BIN2, N'þ', N'__');

Returns:

 this is the news with a __
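
For the actual case in the question, where note is an ntext column, the same idea might look like the sketch below. REPLACE does not accept ntext arguments directly, so the column is cast to NVARCHAR(MAX) first. Note that dbo.SomeTable is a hypothetical table name (the question does not give one); only the note column comes from the question.

```sql
-- dbo.SomeTable is a hypothetical name; note is the ntext column from the question.
SELECT REPLACE(
           -- Cast ntext to NVARCHAR(MAX), then force a binary comparison.
           CAST(note AS NVARCHAR(MAX)) COLLATE Latin1_General_100_BIN2,
           N'þ',   -- N prefix keeps the search literal as Unicode
           N''     -- replace with an empty string to drop the character
       ) AS note_without_thorn
FROM dbo.SomeTable;
```

Applying COLLATE to the first argument is enough here, since an explicit collation takes precedence over the default collation of the literals.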
