I have two tree structures that represent snapshots of a directory structure at two different points in time. Directories may have been added, deleted, or changed between snapshots. I need to walk both trees at the same time and annotate the newer one with the differences between the two: that is, flag each node as new, changed, deleted, or unchanged, and add any deleted nodes to it, so that the end result is a complete superset of the two snapshots.
The trees are typically about 10 levels deep but very wide, containing hundreds of thousands, potentially millions, of nodes. I want to skip large chunks of the trees by comparing the hash codes on each node and only continuing to recurse where the codes do not match.
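To make the idea concrete, here is a minimal Python sketch of the walk described above. It is only an illustration under some assumptions that are not stated in the snapshots themselves: each node's hash is assumed to cover its entire subtree (so equal hashes mean identical subtrees), children are assumed to be keyed by name, and the names `Node`, `digest`, `status`, and `diff` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    name: str
    digest: str  # assumed to summarise the node AND all of its descendants
    children: Dict[str, "Node"] = field(default_factory=dict)
    status: str = ""  # set by diff(): new / changed / deleted / unchanged


def diff(old: Node, new: Node) -> None:
    """Annotate `new` in place and graft deleted subtrees from `old` into it.

    Only the root of a new, deleted, or unchanged subtree is flagged;
    its descendants are understood to inherit that flag.
    """
    stack = [(old, new)]  # explicit stack instead of recursion
    while stack:
        o, n = stack.pop()
        if o.digest == n.digest:
            n.status = "unchanged"  # identical subtree: skip it entirely
            continue
        n.status = "changed"
        for name, old_child in o.children.items():
            new_child = n.children.get(name)
            if new_child is None:
                # Present before, missing now: graft it so the result
                # is a superset. Note this shares the node with `old`.
                old_child.status = "deleted"
                n.children[name] = old_child
            else:
                stack.append((old_child, new_child))
        for name, new_child in n.children.items():
            if name not in o.children:
                new_child.status = "new"
```

With children held in dictionaries, the work done is proportional to the number of nodes on paths where the hashes differ, plus their immediate children; everything under a matching hash is never visited.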
Is there an algorithm that could be my friend here? Any other tips?