First, you can look for duplicate line hashes to identify potential duplicate lines:
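Map<Integer, Integer> hashes = new HashMap<>(); // hash -> index of the first line with that hash
Map<Integer, Integer> dupes = new HashMap<>();  // line index -> earlier line index with the same hash
int i = 0;
String line;
while ((line = buff.readLine()) != null) {
    int hash = line.hashCode();
    Integer previous = hashes.get(hash);
    if (previous != null) {
        // Presumed completion of the truncated snippet: record line i as a potential duplicate.
        dupes.put(i, previous);
    } else {
        hashes.put(hash, i);
    }
    ++i;
}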
At the end, you have a list of potential duplicates. If dupes is empty, there were no duplicates; otherwise, you can make a second pass over the file to check whether the lines are really identical, since two different lines can share the same hash code.
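The second pass could look something like the sketch below. It is only one possible way to do it: the class and method names (DuplicateLineCheck, findCandidates, confirmDuplicates) are made up for illustration, and it assumes the file can be opened twice from a Path and read as UTF-8. Only the text of the candidate lines is kept in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper class, not part of any library.
class DuplicateLineCheck {

    // First pass: map each line index to the earlier line index with the same hash.
    static Map<Integer, Integer> findCandidates(Path file) throws IOException {
        Map<Integer, Integer> hashes = new HashMap<>();
        Map<Integer, Integer> dupes = new TreeMap<>();
        try (BufferedReader buff = Files.newBufferedReader(file)) {
            String line;
            int i = 0;
            while ((line = buff.readLine()) != null) {
                // putIfAbsent returns the index already stored for this hash, or null.
                Integer previous = hashes.putIfAbsent(line.hashCode(), i);
                if (previous != null) {
                    dupes.put(i, previous);
                }
                ++i;
            }
        }
        return dupes;
    }

    // Second pass: keep only the candidates whose text is really identical.
    static Map<Integer, Integer> confirmDuplicates(Path file, Map<Integer, Integer> dupes)
            throws IOException {
        // Line indices whose text we need to look at.
        Set<Integer> wanted = new HashSet<>(dupes.keySet());
        wanted.addAll(dupes.values());

        Map<Integer, String> text = new HashMap<>();
        try (BufferedReader buff = Files.newBufferedReader(file)) {
            String line;
            int i = 0;
            while ((line = buff.readLine()) != null) {
                if (wanted.contains(i)) {
                    text.put(i, line);
                }
                ++i;
            }
        }

        Map<Integer, Integer> confirmed = new TreeMap<>();
        for (Map.Entry<Integer, Integer> e : dupes.entrySet()) {
            if (text.get(e.getKey()).equals(text.get(e.getValue()))) {
                confirmed.put(e.getKey(), e.getValue());
            }
        }
        return confirmed;
    }
}
```

You would call it as confirmDuplicates(file, findCandidates(file)). One limitation of this sketch: each candidate is compared only with the first line that had the same hash, so a line that collides with that first line but actually duplicates a later one would be missed.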