Combining multiple rows into a list for multiple columns

I know that the question "Combine several rows into a list" has been answered a million times, and here is a link to an excellent article: Concatenating Row Values in Transact-SQL

I need to combine multiple rows into lists for multiple columns at once:

    ID | Col1 | Col2          ID | Col1 | Col2
    ------------------   =>   ------------------
    1    A      X             1    A      X
    2    B      Y             2    B,C    Y,Z
    2    C      Z

I tried the FOR XML method, but it turned out to be very slow on large tables:

    SELECT DISTINCT
        [ID],
        [Col1] = STUFF((SELECT ',' + t2.[Col1]
                        FROM #Table t2
                        WHERE t2.ID = t.ID
                        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, ''),
        [Col2] = STUFF((SELECT ',' + t2.[Col2]
                        FROM #Table t2
                        WHERE t2.ID = t.ID
                        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '')
    FROM #Table t

My current solution is a stored procedure that builds the row for each ID separately. I am wondering whether there is another approach I could use (other than a loop). The current approach works like this:

For each column, rank the rows to combine (partition by the key column). End up with a table like:

    ID | Col1 | Col2 | Col1Rank | Col2Rank
    1    A      X      1          1
    2    B      Y      1          1
    2    C      Z      2          2

Create a new table containing the top-ranked values for each ID:

    ID | Col1Comb | Col2Comb
    1    A          X
    2    B          Y

Loop through each remaining rank in increasing order (in this case 1 iteration):

    DECLARE @irank INT = 2;   -- rank 1 is already in #newtable
    DECLARE @maxrank INT = (SELECT MAX(Col1Rank) FROM #oldtable);

    WHILE @irank <= @maxrank
    BEGIN
        UPDATE n
        SET n.Col1Comb = n.Col1Comb + ISNULL(',' + o.Col1, ''),  -- so append the rank 2 items
            n.Col2Comb = n.Col2Comb + ISNULL(',' + o.Col2, '')   -- if they are not null
        FROM #newtable n
        JOIN #oldtable o ON o.ID = n.ID
        WHERE o.Col1Rank = @irank OR o.Col2Rank = @irank;

        SET @irank += 1;
    END
3 answers

An updatable CTE can be used here, together with the variable-assignment trick in UPDATE (the "quirky update").

Method 1: a new parallel table into which data is copied and then concatenated:

    CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), RowID INT IDENTITY(1,1));
    CREATE TABLE #Table1Concat(ID INT, Col3 VARCHAR(MAX), Col4 VARCHAR(MAX), RowID INT);
    GO

    INSERT #Table1 VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
    GO

    INSERT #Table1Concat SELECT * FROM #Table1;
    GO

    DECLARE @Cat1 VARCHAR(MAX) = '';
    DECLARE @Cat2 VARCHAR(MAX) = '';

    ;WITH CTE AS (
        SELECT TOP 2147483647 t1.*, t2.Col3, t2.Col4,
               r = ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
        FROM #Table1 t1
        JOIN #Table1Concat t2 ON t1.RowID = t2.RowID
        ORDER BY t1.ID, t1.Col1, t1.Col2
    )
    UPDATE CTE
    SET @Cat1 = Col3 = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
      , @Cat2 = Col4 = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
    GO

    SELECT ID, Col3 = MAX(Col3), Col4 = MAX(Col4)
    FROM #Table1Concat
    GROUP BY ID

Method 2: add the concatenation columns directly to the source table and update them in place:

    CREATE TABLE #Table1(ID INT, Col1 VARCHAR(1), Col2 VARCHAR(1), Col1Cat VARCHAR(MAX), Col2Cat VARCHAR(MAX));
    GO

    INSERT #Table1(ID,Col1,Col2) VALUES(1,'A','X'), (2,'B','Y'), (2,'C','Z');
    GO

    DECLARE @Cat1 VARCHAR(MAX) = '';
    DECLARE @Cat2 VARCHAR(MAX) = '';

    ;WITH CTE AS (
        SELECT TOP 2147483647 t1.*,
               r = ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.Col1, t1.Col2)
        FROM #Table1 t1
        ORDER BY t1.ID, t1.Col1, t1.Col2
    )
    UPDATE CTE
    SET @Cat1 = Col1Cat = CASE r WHEN 1 THEN ISNULL(Col1,'') ELSE @Cat1 + ',' + Col1 END
      , @Cat2 = Col2Cat = CASE r WHEN 1 THEN ISNULL(Col2,'') ELSE @Cat2 + ',' + Col2 END;
    GO

    SELECT ID, Col1Cat = MAX(Col1Cat), Col2Cat = MAX(Col2Cat)
    FROM #Table1
    GROUP BY ID;
    GO

Try this option:

Query 1:

    DECLARE @temp TABLE
    (
          ID INT
        , Col1 VARCHAR(30)
        , Col2 VARCHAR(30)
    )

    INSERT INTO @temp (ID, Col1, Col2)
    VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z')

    SELECT
          r.ID
        , Col1 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t1/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
        , Col2 = STUFF(REPLACE(REPLACE(CAST(d.x.query('/t2/a') AS VARCHAR(MAX)), '<a>', ','), '</a>', ''), 1, 1, '')
    FROM (
        SELECT DISTINCT ID
        FROM @temp
    ) r
    OUTER APPLY (
        -- Build one XML fragment per ID that holds both columns
        SELECT x = CAST((
            SELECT
                  [t1/a] = t2.Col1
                , [t2/a] = t2.Col2
            FROM @temp t2
            WHERE r.ID = t2.ID
            FOR XML PATH('')
        ) AS XML)
    ) d

Query 2:

    SELECT
          r.ID
        , Col1 = STUFF(REPLACE(CAST(d.x.query('for $a in /a return xs:string($a)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '')
        , Col2 = STUFF(REPLACE(CAST(d.x.query('for $b in /b return xs:string($b)') AS VARCHAR(MAX)), ' ,', ','), 1, 1, '')
    FROM (
        SELECT DISTINCT ID
        FROM @temp
    ) r
    OUTER APPLY (
        -- Prefix each value with a comma inside the XML fragment
        SELECT x = CAST((
            SELECT
                  [a] = ',' + t2.Col1
                , [b] = ',' + t2.Col2
            FROM @temp t2
            WHERE r.ID = t2.ID
            FOR XML PATH('')
        ) AS XML)
    ) d

Output:

    ID          Col1       Col2
    ----------- ---------- ----------
    1           A          X
    2           B,C        Y,Z

One solution, which is at least syntactically straightforward, is to use a user-defined aggregate to join the values together. This requires SQLCLR, and although some people are reluctant to enable it, it provides a set-based approach that does not need to re-query the base table for each column. Joining is the opposite of splitting: it builds a comma-separated list from individual rows.
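For comparison, on SQL Server 2017 or later the built-in STRING_AGG aggregate gives the same set-based shape without SQLCLR. This is only a minimal sketch, assuming that version is available; #Demo is a hypothetical temp table that holds the sample rows from the question.

```sql
-- Assumes SQL Server 2017+ (STRING_AGG does not exist in earlier versions).
-- #Demo is a hypothetical temp table with the question's sample data.
CREATE TABLE #Demo (ID INT, Col1 VARCHAR(10), Col2 VARCHAR(10));

INSERT INTO #Demo (ID, Col1, Col2)
VALUES (1, 'A', 'X'), (2, 'B', 'Y'), (2, 'C', 'Z');

-- One pass over the table: each STRING_AGG builds one list per ID.
-- WITHIN GROUP fixes the order of the items inside each list; the same
-- ordering is used for both columns so the two lists stay aligned.
SELECT ID,
       Col1 = STRING_AGG(Col1, ',') WITHIN GROUP (ORDER BY Col1, Col2),
       Col2 = STRING_AGG(Col2, ',') WITHIN GROUP (ORDER BY Col1, Col2)
FROM #Demo
GROUP BY ID;

DROP TABLE #Demo;
```

STRING_AGG skips NULL values. With non-MAX input types the result is limited to 8000 bytes, so for long lists the input would need a CAST to VARCHAR(MAX) or NVARCHAR(MAX).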

Below is a simple example that uses the SQL# (SQLsharp) library, which comes with a user-defined aggregate named Agg_Join() that does exactly what is asked for here. You can download the free version of SQL# from http://www.SQLsharp.com/; the example SELECT runs against a standard system view. (In the interest of full disclosure, I am the author of SQL#, but this function is available in the free version.)

    SELECT sc.[object_id],
           OBJECT_NAME(sc.[object_id]) AS [ObjectName],
           SQL#.Agg_Join(sc.name) AS [ColumnNames],
           SQL#.Agg_Join(DISTINCT sc.system_type_id) AS [DataTypes]
    FROM sys.columns sc
    GROUP BY sc.[object_id]

I recommend testing this against your current solution(s) to find out which is fastest for the volume of data you expect to have over at least the next year or two.
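One rough way to run such a comparison is to wrap each candidate query between two timestamps. The sketch below is only an illustration: it times the FOR XML PATH pattern from the question against the sys.columns system view so that it runs without any setup, and the measured time includes returning the rows to the client.

```sql
-- Rough timing harness (a sketch): run once per candidate query and compare.
DECLARE @start DATETIME2(3) = SYSDATETIME();

-- Candidate under test: the FOR XML PATH pattern from the question,
-- pointed at sys.columns so the script needs no user tables.
SELECT sc.[object_id],
       ColumnNames = STUFF((SELECT ',' + c2.name
                            FROM sys.columns c2
                            WHERE c2.[object_id] = sc.[object_id]
                            FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '')
FROM sys.columns sc
GROUP BY sc.[object_id];

-- Elapsed wall-clock time for the candidate, in milliseconds.
SELECT ElapsedMs = DATEDIFF(MILLISECOND, @start, SYSDATETIME());
```

Swapping the middle statement for each alternative, and repeating the run a few times on realistic data volumes, gives numbers that can be compared directly.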


Source: https://habr.com/ru/post/1481085/

