I would recommend using md5deep or sha1deep. On Linux, just install the md5deep package (it is included in most Linux distributions).
After installing it, run it in recursive mode over the entire disk and save the checksum of every file to a text file with the following command:
md5deep -r -l . > filelist.txt
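Each output line is a checksum followed by the file's path (relative, because of -l). The hashes and paths below are invented purely for illustration; notice that the duplicate pair is scattered through the list until it gets sorted in the next step:

d41d8cd98f00b204e9800998ecf8427e  ./backup/notes.txt
9e107d9d372bb6826bd81d3542a419d6  ./docs/fox.txt
d41d8cd98f00b204e9800998ecf8427e  ./old/notes-copy.txt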
If you like SHA-1 better than MD5, use sha1deep instead (it is part of the same package).
Once you have the file, just sort it with sort (or pipe it to sort in the previous step):
sort < filelist.txt > filelist_sorted.txt
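If you prefer to skip the intermediate file, the two steps collapse into a single pipeline:

md5deep -r -l . | sort > filelist_sorted.txt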
Now just view the result in any text editor: since identical files produce identical checksums, the duplicates end up on adjacent lines, each with its location on the disk.
If you are so inclined, you can write a simple script in Perl or Python to remove the duplicates based on this file list.
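As a minimal sketch of such a script, here is a Python version that reads the sorted listing, groups adjacent lines by checksum, and reports each set of duplicates. The filename filelist_sorted.txt and the "checksum, whitespace, path" line format match the commands above; it only prints the groups rather than deleting anything, but the comment shows where an os.remove() call would go if you want it to clean up automatically.

#!/usr/bin/env python3
# Report duplicate files from a sorted md5deep/sha1deep listing.
# Assumes each line looks like "<checksum>  <path>", as produced by
# "md5deep -r -l . | sort"; paths containing newlines are not handled.
import sys
from itertools import groupby

def parse(line):
    # Split into checksum and path; the path may contain spaces.
    checksum, path = line.rstrip("\n").split(None, 1)
    return checksum, path

def main(listing):
    with open(listing) as f:
        entries = [parse(line) for line in f if line.strip()]
    # The input is sorted by checksum, so duplicates are adjacent
    # and groupby() collects them without a dictionary.
    for checksum, group in groupby(entries, key=lambda e: e[0]):
        paths = [path for _, path in group]
        if len(paths) > 1:
            print(checksum + ":")
            for p in paths:
                print("  " + p)
            # To delete instead of report, remove all but the first copy:
            # for p in paths[1:]: os.remove(p)

if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else "filelist_sorted.txt")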