I have two versions of a very large and complex directory structure with tens of thousands of separate files, and I want to look for significant file changes from one version to another.
Each file has changed slightly. For example, a file called intro.txt might contain:
[Build 1057 done by Mike 12:00] - (version 1)
[Build 1065 made by Mike 18:10] - (version 2)
I am not interested in changes like this because they carry no useful information. I also don't care about spelling corrections or the addition of a word or two.
What I really want is to pull out the files that have changed in a more significant way. One way a file may have changed is by gaining a lot of additional content, which increases its size, and that is the kind of change I am interested in.
So, how would you recursively scan the directories for files whose size increased (or decreased) by at least a given amount from one version to the other?
I am running Linux, but a solution in any language will work.
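
To make the goal concrete, here is a rough Python sketch of the kind of comparison I have in mind. It is only an illustration under some assumptions: the function name `find_size_changes` and the `min_bytes` threshold are made up for this example, both versions are assumed to share the same relative layout, and files that exist in only one version are simply skipped.

```python
import os


def find_size_changes(old_root, new_root, min_bytes=1024):
    """Yield (relative_path, old_size, new_size) for every file that exists
    in both trees and whose size differs by at least min_bytes.

    Hypothetical helper for illustration; min_bytes is an assumed threshold.
    """
    for dirpath, _dirnames, filenames in os.walk(old_root):
        for name in filenames:
            old_path = os.path.join(dirpath, name)
            # Path of the file relative to the old root, reused under the new root.
            rel = os.path.relpath(old_path, old_root)
            new_path = os.path.join(new_root, rel)
            # Skip anything that is not a regular file in both versions
            # (added or deleted files are not reported in this sketch).
            if not (os.path.isfile(old_path) and os.path.isfile(new_path)):
                continue
            old_size = os.path.getsize(old_path)
            new_size = os.path.getsize(new_path)
            # abs() catches both growth and shrinkage.
            if abs(new_size - old_size) >= min_bytes:
                yield rel, old_size, new_size


# Example call (the directory names are placeholders):
# for rel, old, new in find_size_changes("version1", "version2", min_bytes=500):
#     print(f"{new - old:+d}\t{rel}")
```

A file like intro.txt above, where only the build line differs, would change by a few bytes at most and fall under any reasonable threshold, so it would not be reported.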