I am working on a problem where certain characters need to be cleaned out of existing rows in a table. Normally I would do a simple UPDATE with a REPLACE, but in this case there are 32 different characters that need to be removed.
I have looked around a bit, and I can't find any great solutions for quickly cleaning rows that already exist in the table.
Things I have looked into:
Running a series of nested REPLACEs
This solution works, but for 32 different REPLACEs it would require either some ugly code or hacky dynamic SQL to build a huge chain of REPLACE calls.
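To make the shape of that concrete, here is a minimal sketch of the nested form using only the four sample pairs from my reference script further down. The variable `@s` and the column alias are just illustrative, and the result depends on collation (under a case-insensitive collation, 's' also matches 'S').

```sql
-- Sketch only: four nested REPLACE calls, innermost applied first.
-- With 32 characters this would grow to 32 levels of nesting.
DECLARE @s nvarchar(20);
SET @s = N'AB-String-BA';

SELECT REPLACE(
           REPLACE(
               REPLACE(
                   REPLACE(@s, N'A', N'^'),  -- first pair
               N'B', N'}'),                  -- second pair
           N's', N'5'),                      -- third pair
       N'-', N' ') AS CleanString;           -- fourth pair
```

The same expression could go into an `UPDATE ... SET DirtyString = ...` against the real table, which is where the code starts to get ugly.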
PATINDEX and while loops
As this answer shows, you can mimic a kind of regular-expression replace, but I am working with a lot of data, so I am hesitant to trust even the improved solution to run in a reasonable amount of time when the data volume is large.
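For context, this is roughly what I understand the PATINDEX approach to look like when applied to a table. It is a sketch under my own assumptions, not the code from the linked answer: the table `#PatindexDemo` and the pattern are made up for illustration, and it removes characters rather than mapping them to replacements.

```sql
-- Hypothetical demo table; names are illustrative only.
CREATE TABLE #PatindexDemo (DirtyString nvarchar(20));
INSERT INTO #PatindexDemo (DirtyString) SELECT N'AAAAA';
INSERT INTO #PatindexDemo (DirtyString) SELECT N'AB-String-BA';

-- Character class of unwanted characters. The hyphen is listed first so it
-- is treated as a literal, not a range. Matching follows the column collation.
DECLARE @pattern nvarchar(50);
SET @pattern = N'%[-ABs]%';

-- Each pass removes the first unwanted character from every row that still
-- has one, so the number of passes equals the worst row's bad-character count.
WHILE 1 = 1
BEGIN
    UPDATE #PatindexDemo
    SET DirtyString = STUFF(DirtyString, PATINDEX(@pattern, DirtyString), 1, N'')
    WHERE PATINDEX(@pattern, DirtyString) > 0;

    IF @@ROWCOUNT = 0 BREAK;
END;

SELECT DirtyString FROM #PatindexDemo;
DROP TABLE #PatindexDemo;
```

Every pass rescans the table, which is exactly what worries me at my data volumes.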
Recursive CTE
I tried a CTE approach to this problem, but it did not run very quickly once the number of rows got large.
For reference:
CREATE TABLE #BadChar(
    id int IDENTITY(1,1),
    badString nvarchar(10),
    replaceString nvarchar(10)
);
INSERT INTO #BadChar(badString, replaceString) SELECT 'A', '^';
INSERT INTO #BadChar(badString, replaceString) SELECT 'B', '}';
INSERT INTO #BadChar(badString, replaceString) SELECT 's', '5';
INSERT INTO #BadChar(badString, replaceString) SELECT '-', ' ';
CREATE TABLE #CleanMe(
    clean_id int IDENTITY(1,1),
    DirtyString nvarchar(20)
);
DECLARE @i int;
SET @i = 0;
WHILE @i < 100000 BEGIN
    INSERT INTO #CleanMe(DirtyString) SELECT 'AAAAA';
    INSERT INTO #CleanMe(DirtyString) SELECT 'BBBBB';
    INSERT INTO #CleanMe(DirtyString) SELECT 'AB-String-BA';
    SET @i = @i + 1;
END;
WITH FixedString (Step, String, cid) AS (
    -- Anchor: apply the first replacement (id = 1) to every row
    SELECT 1 AS Step, REPLACE(DirtyString, badString, replaceString), clean_id
    FROM #BadChar, #CleanMe
    WHERE id = 1
    UNION ALL
    -- Recursive step: apply the next replacement to the previous result
    SELECT Step + 1, REPLACE(String, badString, replaceString), cid
    FROM FixedString AS T1
    JOIN #BadChar AS T2 ON T1.Step + 1 = T2.id
    JOIN #CleanMe AS T3 ON T1.cid = T3.clean_id
)
-- Keep only the rows from the final step
SELECT String FROM FixedString WHERE Step = (SELECT MAX(Step) FROM FixedString);
DROP TABLE #BadChar;
DROP TABLE #CleanMe;
Using the CLR
This seems to be a common solution that many people use, but the environment I work in does not make it easy to get started with.
Are there any other ways to go about this that I have overlooked? Or are there any improvements to the methods I have already looked into?