How to define groups / clusters in a set of arcs / edges in SQL?

I have arcs / edges similar to these properties:

Node1   Node2
A       B
B       C
D       E

Here A is connected to B and B to C. D is connected to E. In other words, there are 2 groups / clusters shown on these “wants:

Node1   Node2   Cluster
A       B       1
B       C       1
D       E       2

Can SQL be used to identify these groups / clusters? I assume this is due to self-joins, but I don't see how to write this SQL. Any feedback would be greatly appreciated. Thank!

+4
source share
2 answers

Assuming that from the sample of data presented in the question that there are no loops, as A->B->C->Ain the data, the following is a query that returns the desired result from the nodestable.

WITH RECURSIVE NodeCluster (node1,node2,Cluster1) AS
      (SELECT node1,
              node2,
              Rank() Over (
                           ORDER BY node1)
       FROM nodes AS n1
       WHERE NOT EXISTS
           (SELECT *
            FROM nodes AS n2
            WHERE n1.node1 = n2.node2)
       UNION ALL SELECT N1.node1,
                        N1.node2,
                        NodeCluster.Cluster1
       FROM nodes n1,
            NodeCluster
       WHERE NodeCluster.node2=n1.node1 )
    SELECT *
    FROM NodeCluster
    ORDER BY Cluster1,
             node1,
             node2;

ranked asc, .

.

Node1 | Node2  | Cluster1
-------------------------
A       B        1
B       C        1
D       E        2

, .

Node1 | Node2
-------------
A        B   
B        C   
D        E   
E        F   
F        G   
H        I   
I        J   
J        K   
L        M  

.

Node1 | Node2  | Cluster1
-------------------------
A       B         1
B       C         1
D       E         2
E       F         2
F       G         2
H       I         3
I       J         3
J       K         3
L       M         4

Teradata SQL Assistant, bteq teradata ANSI.

, .

+1

:

    DECLARE @Edges TABLE
    (
      ID INT ,
      Node1 CHAR(1) ,
      Node2 CHAR(1)
    );
INSERT  INTO @Edges
        ( Node1, Node2 )
VALUES  ( 'A', 'B' ),
        ( 'B', 'C' ),
        ( 'D', 'E' );
WITH    CTE_Nodes
          AS ( SELECT   Node1 AS Node
               FROM     @Edges
               UNION
               SELECT   Node2 AS Node
               FROM     @Edges
             ),
        CTE_Pairs
          AS ( SELECT   Node1 ,
                        Node2
               FROM     @Edges
               WHERE    Node1 <> Node2
               UNION
               SELECT   Node2 AS Node1 ,
                        Node1 AS Node2
               FROM     @Edges
               WHERE    Node1 <> Node2
             ),
        CTE_Recursive
          AS ( SELECT   CAST(CTE_Nodes.Node AS VARCHAR(8000)) AS AnchorNode ,
                        Node1 ,
                        Node2 ,
                        CAST(',' + Node1 + ',' + Node2 + ',' AS VARCHAR(8000)) AS NodePath ,
                        1 AS Lvl
               FROM     CTE_Pairs
                        INNER JOIN CTE_Nodes ON CTE_Nodes.Node = CTE_Pairs.Node1
               UNION ALL
               SELECT   CTE_Recursive.AnchorNode ,
                        CTE_Pairs.Node1 ,
                        CTE_Pairs.Node2 ,
                        CAST(CTE_Recursive.NodePath + CTE_Pairs.Node2 + ',' AS VARCHAR(8000)) AS NodePath ,
                        CTE_Recursive.Lvl + 1 AS Lvl
               FROM     CTE_Pairs
                        INNER JOIN CTE_Recursive ON CTE_Recursive.Node2 = CTE_Pairs.Node1
               WHERE    CTE_Recursive.NodePath NOT LIKE CAST('%,'
                        + CTE_Pairs.Node2 + ',%' AS VARCHAR(8000))
             ),
        CTE_RecursionResult
          AS ( SELECT   AnchorNode ,
                        Node1 ,
                        Node2
               FROM     CTE_Recursive
             ),
        CTE_CleanResult
          AS ( SELECT   AnchorNode ,
                        Node1 AS Node
               FROM     CTE_RecursionResult
               UNION
               SELECT   AnchorNode ,
                        Node2 AS Node
               FROM     CTE_RecursionResult
             )
    SELECT  Edges.Node1 ,
            Edges.Node2 ,
            DENSE_RANK() OVER ( ORDER BY CASE WHEN CA_Data.XML_Value IS NULL
                                              THEN Edges.Node1
                                              ELSE CA_Data.XML_Value
                                         END ) AS Cluster
    FROM    @Edges Edges
            CROSS APPLY ( SELECT    CTE_CleanResult.Node + ','
                          FROM      CTE_CleanResult
                          WHERE     CTE_CleanResult.AnchorNode = Edges.Node1
                          ORDER BY  CTE_CleanResult.Node
                        FOR
                          XML PATH('') ,
                              TYPE
                        ) AS CA_XML ( XML_Value )
            CROSS APPLY ( SELECT    CA_XML.XML_Value.value('.',
                                                           'NVARCHAR(MAX)')
                        ) AS CA_Data ( XML_Value );

SQL Fiddle DEMO

+1

Source: https://habr.com/ru/post/1681859/


All Articles