Efficiently cleaning strings in a table

I am currently working on a problem where certain characters need to be cleaned out of strings that already exist in a table. Normally a simple UPDATE with a REPLACE would do, but in this case there are 32 different characters that need to be removed.

I have looked around, but I cannot find a good solution for quickly cleaning strings that are already in a table.

Things I have looked into:

  • Running a series of nested REPLACEs

    This works, but for 32 different replacements it requires either some ugly code or hacky dynamic SQL to build a huge chain of REPLACE calls.

  • PATINDEX and while loops

    As this answer shows, you can mimic a regular-expression replace this way, but I am working with a lot of data, so I hesitate to trust even the improved solution to run in a reasonable amount of time on large volumes.

  • Recursive CTE

    I tried a recursive CTE for this problem, but it was not very fast once the number of rows grew large.

For reference:

CREATE TABLE #BadChar(
    id int IDENTITY(1,1),
    badString nvarchar(10),
    replaceString nvarchar(10)
);

INSERT INTO #BadChar(badString, replaceString) SELECT 'A', '^';
INSERT INTO #BadChar(badString, replaceString) SELECT 'B', '}';
INSERT INTO #BadChar(badString, replaceString) SELECT 's', '5';
INSERT INTO #BadChar(badString, replaceString) SELECT '-', ' ';

CREATE TABLE #CleanMe(
    clean_id int IDENTITY(1,1),
    DirtyString nvarchar(20)
);

DECLARE @i int;
SET @i = 0;
WHILE @i < 100000 BEGIN
    INSERT INTO #CleanMe(DirtyString) SELECT 'AAAAA';
    INSERT INTO #CleanMe(DirtyString) SELECT 'BBBBB';
    INSERT INTO #CleanMe(DirtyString) SELECT 'AB-String-BA';
    SET @i = @i + 1;
END;


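-- Recursive CTE: step N applies the N-th bad/replace pair to every row.
WITH FixedString (Step, String, cid) AS (
    -- Anchor: apply the first pair (id = 1) to each dirty string.
    SELECT 1 AS Step, REPLACE(DirtyString, badString, replaceString), clean_id
    FROM #BadChar
    CROSS JOIN #CleanMe
    WHERE id = 1

    UNION ALL

    -- Recursive step: apply the next pair to the previous step's output.
    SELECT T1.Step + 1, REPLACE(T1.String, T2.badString, T2.replaceString), T1.cid
    FROM FixedString AS T1
    JOIN #BadChar AS T2 ON T1.Step + 1 = T2.id
    JOIN #CleanMe AS T3 ON T1.cid = T3.clean_id
)
SELECT String FROM FixedString WHERE Step = (SELECT MAX(Step) FROM FixedString);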

DROP TABLE #BadChar;
DROP TABLE #CleanMe;

  • Using the CLR

    This seems to be a common solution that many people use, but the environment I work in does not make it easy to get started with.

Are there other ways to do this that I have not looked into? Or are there improvements to the methods I have already looked at?
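For comparison, the nested-REPLACE baseline from the first bullet can be sketched as follows for the four sample pairs in #BadChar. This is only an illustration: the #Demo table is a hypothetical stand-in for #CleanMe, and the COLLATE clause is an assumption added so that matching is case-sensitive regardless of the server's default collation.

```sql
-- Hypothetical scratch table standing in for #CleanMe.
CREATE TABLE #Demo (
    demo_id     int IDENTITY(1,1),
    DirtyString nvarchar(20)
);

INSERT INTO #Demo (DirtyString)
VALUES (N'AAAAA'), (N'BBBBB'), (N'AB-String-BA');

-- One REPLACE per bad/replace pair, nested inside-out.
-- Assumption: the binary collation makes the match case-sensitive.
UPDATE #Demo
SET DirtyString =
    REPLACE(REPLACE(REPLACE(REPLACE(
        DirtyString COLLATE Latin1_General_BIN,
        N'A', N'^'),
        N'B', N'}'),
        N's', N'5'),
        N'-', N' ');

SELECT demo_id, DirtyString FROM #Demo ORDER BY demo_id;

DROP TABLE #Demo;
```

With 32 pairs the same pattern needs 32 nested calls, which is the "ugly code" problem described above.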

1 answer

Building on Alan Burstein's idea, you could do something like this if you want to hard-code the bad/replace pairs.

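CREATE FUNCTION [dbo].[CleanStringV1]
(
  @String   nvarchar(4000)
)
RETURNS nvarchar(4000) WITH SCHEMABINDING AS
BEGIN
  -- Apply REPLACE once per (badString, replaceString) row; each row's result
  -- feeds the next. Row order is not guaranteed, so pairs should not overlap.
  -- The binary collation makes the match case-sensitive.
  SELECT @String = REPLACE
  (
    @String COLLATE Latin1_General_BIN,
    badString,
    replaceString
  )
  FROM
  (VALUES
      ('A', '^')
    , ('B', '}')
    , ('s', '5')
    , ('-', ' ')
  ) t(badString, replaceString);
  RETURN @String;
END;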
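
A minimal usage sketch, assuming dbo.CleanStringV1 has been created as shown above. The #Demo table is hypothetical and only stands in for the question's #CleanMe so that the example can be run on its own.

```sql
-- Assumes dbo.CleanStringV1 already exists in the current database.
CREATE TABLE #Demo (
    demo_id     int IDENTITY(1,1),
    DirtyString nvarchar(20)
);

INSERT INTO #Demo (DirtyString)
VALUES (N'AAAAA'), (N'BBBBB'), (N'AB-String-BA');

-- Clean every row in place with a single set-based UPDATE.
UPDATE #Demo
SET    DirtyString = dbo.CleanStringV1(DirtyString);

SELECT demo_id, DirtyString FROM #Demo ORDER BY demo_id;

DROP TABLE #Demo;
```

For the four sample pairs, the rows should come back as '^^^^^', '}}}}}' and '^} String }^'. The capital 'S' in 'String' is left alone because the function matches case-sensitively.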

Or, if you would rather read the bad/replace pairs from a table:

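CREATE FUNCTION [dbo].[CleanStringV2]
(
  @String   nvarchar(4000)
)
RETURNS nvarchar(4000) AS
BEGIN
  -- Same technique as V1, but the pairs come from a table.
  -- BadChar must be a permanent table: a function cannot read a
  -- temp table such as #BadChar.
  SELECT @String = REPLACE
  (
    @String COLLATE Latin1_General_BIN,
    badString,
    replaceString
  )
  FROM dbo.BadChar;
  RETURN @String;
END;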
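
CleanStringV2 reads from a permanent BadChar table, which the question never creates (it only has the temp table #BadChar). The following sketch sets one up, assuming it lives in the dbo schema and mirrors the columns of #BadChar; the PRIMARY KEY and NOT NULL constraints are additions for illustration. The table must exist by the time the function is called.

```sql
-- Assumption: a permanent table mirroring #BadChar from the question.
CREATE TABLE dbo.BadChar (
    id            int IDENTITY(1,1) PRIMARY KEY,
    badString     nvarchar(10) NOT NULL,
    replaceString nvarchar(10) NOT NULL
);

INSERT INTO dbo.BadChar (badString, replaceString)
VALUES (N'A', N'^'), (N'B', N'}'), (N's', N'5'), (N'-', N' ');

-- Spot-check against a value from the question's sample data.
SELECT dbo.CleanStringV2(N'AB-String-BA') AS Cleaned;
```

With these four rows, the spot-check should return the same value as CleanStringV1 does. New pairs can then be added with a plain INSERT, without altering the function.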

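Note the COLLATE clause: it forces a binary comparison, so REPLACE matches case-sensitively. Without it, REPLACE follows the collation of its input, which is often case-insensitive.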
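
The effect of the collation on REPLACE can be checked directly. This is a small sketch; Latin1_General_CI_AS is used here only as an example of a case-insensitive collation and is not taken from the answer.

```sql
SELECT
    -- Binary collation: lowercase 's' does not match the capital 'S'.
    REPLACE(N'String' COLLATE Latin1_General_BIN,   N's', N'5') AS BinaryMatch,      -- 'String'
    -- Case-insensitive collation: 's' matches 'S' as well.
    REPLACE(N'String' COLLATE Latin1_General_CI_AS, N's', N'5') AS CaseInsensitive;  -- '5tring'
```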


Source: https://habr.com/ru/post/1682153/

