How to traverse two arbitrarily complex tree structures simultaneously and create a superset?

I have two tree structures that represent snapshots of a directory structure at two different points in time. Directories may have been added, deleted, or changed between snapshots. I need to walk both trees at the same time and annotate the newer one with the differences between the two: that is, flag each node as new, changed, deleted, or unchanged, and add any deleted nodes to it, so that the end result is a complete superset of the two snapshots.

Typically, the trees are about 10 levels deep but very wide, containing hundreds of thousands, potentially millions, of nodes. I want to skip large chunks of the trees by comparing the hash codes on each node and only continuing to recurse where the codes do not match.

Is there an algorithm that could be my friend here? Any other tips?

2 answers

Imagine flattening each tree into a sorted list of files and directories. A method could then fetch the next entry for each flattened tree from an iterator over that tree. You could then compare the hash codes and skip ahead in one tree or the other, noting deletions and changes as you go.
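
As a rough illustration of the idea, here is a minimal Python sketch. It is a variation rather than a literal flattening: it merges the sorted child names one directory at a time, so that matching hashes can prune a whole subtree. The `Node` class, its `digest` field, and the status strings are all hypothetical, and the sketch assumes each node's digest covers everything below it.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    name: str
    digest: str  # assumption: covers this node and its whole subtree
    children: Dict[str, "Node"] = field(default_factory=dict)
    status: Optional[str] = None  # set by diff(): new / changed / deleted / unchanged


def mark(node: Node, status: str) -> Node:
    """Flag an entire subtree with one status (added or deleted subtrees)."""
    node.status = status
    for child in node.children.values():
        mark(child, status)
    return node


def diff(old: Node, new: Node) -> Node:
    """Annotate `new` in place so it becomes the superset of both snapshots."""
    if old.digest == new.digest:
        # Identical subtree: prune here. Descendants are left unvisited
        # and are implicitly unchanged.
        new.status = "unchanged"
        return new
    new.status = "changed"
    # Merge the two sorted child-name sequences of this directory.
    for name in sorted(old.children.keys() | new.children.keys()):
        o, n = old.children.get(name), new.children.get(name)
        if o is None:
            mark(n, "new")
        elif n is None:
            # Graft the deleted subtree into the newer tree. Note that this
            # shares the node objects with the old tree instead of copying.
            new.children[name] = mark(o, "deleted")
        else:
            diff(o, n)
    return new
```

Calling `diff(old_root, new_root)` returns `new_root` with every visited node flagged. With trees only about 10 levels deep, plain recursion should not get near Python's default recursion limit; the cost is dominated by the directories whose digests differ.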


[...] XML [...]:

1) rsync [...]: http://samba.anu.edu.au/ftp/rsync/rsync.html [...] rsync -list- [...].

2) [...] rolling hash (http://en.wikipedia.org/wiki/Rolling_hash).

[...] diff [...] xdelta [...].


Source: https://habr.com/ru/post/1728686/

