Identification of graphs in a heap of related nodes - what is it called?

I have an SQL table with three columns X, Y, Z. I need to split it into groups so that all records with the same X or Y or Z value are assigned to the same group. I need to make sure that records with the same value of X or Y or Z never break into multiple groups.

If you consider records as nodes and X, Y, Z values ​​as edges, this problem is the same as finding all the graphs, where the nodes in each graph will be connected directly or indirectly via X, Y or Z-edge, but each graph will not have any edges along with other graphs (otherwise it will be part of the same graph).

A few years ago, I knew that it was called and even remembered the algorithm, but now it eludes me. Please tell me how this problem is called so that I can find a solution to Google. If you have a good algorithm now, please indicate it to me. If you have a SQL implementation - I will marry you :)

Example:

XYZ BUCKET --------- ---------------- --------- ----------- 1 34 56 1 54 43 45 2 1 12 22 1 2 34 11 1 

The last row is in bucket 1 because of the value Y = 34, which matches the value of the first row, which is in bucket 1.

+1
source share
3 answers

It does not look like a graph, but rather a simplicial complex . But if we consider this complex as its skeletal graph (numbers are considered as vertices, and a row in the table means that all three vertices are connected by an edge), then we can simply use any algorithm to find the connected components of this graph. I'm not sure if there is a way to do this in SQL, although it might be wiser to use a graph database somehow.

However, for this specific task, there might be some simple solution available with SQL that I have not been looking for.

+2
source

to find out how many nodes in each group x:

 select x, count(x) from mytable group by x 

or find the list of sets x:

 select distinct x from mytable; 
0
source

Why don't you initially select GROUP BY one of the columns (for example, X), make buckets, and then do it for Y and Z, every time you combine all the buckets from the previous step, if you find new groups.

Repeat the process for X, Y, and Z until the buckets stop changing.

Do you work on connected or facebook? :)

0
source

Source: https://habr.com/ru/post/920448/


All Articles