I understand that incl inserts an element into an existing set? If so, where all the real work is going on.
A union definition is a set that includes everything in any input set. If two sets are saved as binary trees, if you take the unions of the first set with the branches of the second, the only element that may not be the result is the element in the root directory of the node of the second tree, so if you insert this element, you have a union of both input sets.
This is just a very inefficient way to insert each element from both sets into a new set that starts empty. Presumably duplicates are discarded incl , so the result is a union of the two inputs.
Perhaps this would help to ignore the tree structure at the moment; this is not very important for the main algorithm. Say we have abstract math sets. Given a set of input with unknown elements, we can do two things:
- Add an element to it (which does nothing if the element is already present)
- Check if the set is not empty and, if so, decompose it into one element and two disjoint subsets.
To take the union of the two sets {1,2} and {2,3}, we start by decomposing the first set into element 1 and the subsets {} and {2}. We recursively use the union of {}, {2} and {2,3} using the same process, then insert 1 into the result.
At each step, the problem boils down from one join operation to two joint operations on the smaller inputs; standard separation and rest algorithm. Upon reaching the union of the one-point set {x} and the empty set {}, the union is trivial {x}, which then returns back to the chain.
The tree structure is used only to allow case analysis / decomposition into smaller sets and to make the insertion more efficient. The same can be done using other data structures, such as lists, bisected for decomposition and with an insert made with an exhaustive check for uniqueness. To effectively use a union, an algorithm is needed that is a little smarter and uses the structure used to store items.