Is there a friendly name for this data structure?

While working on a function selector for a machine learning algorithm in Python, I created a data structure with the following code:

# Perform set partitioning on the results groups = [] for t in results: (jthName,kthName) = t jthGroup = -1 kthGroup = -1 # Just a simple list of hashes with online merging for idx,group in enumerate(groups): if jthName in group: jthGroup = idx if kthName in group: kthGroup = idx if jthGroup == kthGroup: if jthGroup == -1: # Implicit: "and kthGroup == -1" groups.append(set((jthName,kthName))) elif jthGroup != kthGroup: if kthGroup == -1: # Merge kthName into jthGroup groups[jthGroup].add(kthName) elif jthGroup == -1: # Merge jthName into kthGroup (redundant if naturally-ordered) groups[kthGroup].add(jthName) else: # Merge jthGroup and kthGroup, since we have a connecting pair merged = set() merged.update(groups[jthGroup]) merged.update(groups[kthGroup]) groups.remove(groups[jthGroup]) groups.remove(groups[kthGroup]) groups.append(merged) 

Where my input, results , is a list of tuples {2} and groups is a list of tuples. Please note that my code here is not necessarily effective; it is provided for illustrative purposes only.

My groups data structure has the following properties:

  • For each (jthName,kthName) :

    • If no item from (jthName,kthName) is found in any contained set, create set((jthName,kthName)) in our list of sets.
    • If only one of (jthName,kthName) is in one contained set, combine the unreasonable element into this set.
    • If each item from (jthName,kthName) is in a different set, combine the two sets of links into one set.
  • Loop invariant: jthName and kthName cannot be contained in more than one set.


My excuse for this data structure is to create a flat decomposition of an unknown set of related graphs, where each unique element name is a node, and each unique pair is an edge. My rationale is that my graphs are incomplete, and I require that this representation select only the known members of each graph to feed into an algorithm that regressively determines the graph connectivity and direction of the edges (i.e. the complete set of DAGs expressed by data). But, I was distracted.

Is there a friendly name for the data structure represented by the groups variable? If this is so or not, is there more efficient time or space for doing this decomposition?

+4
source share
1 answer

I think what you are looking for is called a Data Structure with unrelated sets .

It is often used when executing Kruskal because it allows you to do n searches in amortized nlog * n (actually less than this) time if you implement a data structure with an unconnected network with path compression.

This is pretty reasonable to implement, and I think the pseudo-code of the wiki page is great for python. If you need more help, this question may help .

If you used a data structure with unrelated sets, your code would look like this:

 for t in results: (jName, kName) = t union(jName, kName) 
+7
source

Source: https://habr.com/ru/post/1399908/


All Articles