Goal: Want to effectively find all disabled graphs from a large set of sets
For example, I have a data file, for example:
A, B, C
C, D, E
A, F, Z
G, J
...
Each entry is a collection of items. First entries A, B, C = {A, B, C} This also indicates that there is an edge between A and B, A and C, B and C.
I first developed an algorithm
1.parse all the entries into a list:
[
{A,B,C}
{C,D,E}
...
]
2.start with the first element/set of the list can called start_entry, {A,B,C} in this case
3.traverse other element in the list and do the following:
if the intersection of the element and start_entry is not empty
start_entry = start_entry union with the element
remove element from the list
4.with the updated start_entry, traverse the list again until there is not new update
The algorithm above should return a list of vertices of a connected graph. However, I ran into a runtime issue due to the size of the dataset. There are ~ 100,000 entries. So I just wonder if anyone knows if there is a more efficient way to find a related graph.
The data structure can also be changed (if it is easier) A, B BEFORE ERA E, F ... and each record represents the edge of the graph.