What is the fastest way to compare two text files without counting moved lines as different?

I have two files of about 50,000 lines each. I need to compare these two files and identify the changes. The catch is that if a line is present in both files but at a different position, it must not be reported as different.

For example, consider this A.txt file:

xxxxx
yyyyy
zzzzz

and this B.txt file:

zzzzz
xxxx
yyyyy

So, if these are the contents of the files, my code should give the result xxxx (or both xxxx and xxxxx).

Of course, the easiest way is to store each line of one file in a List<String> and compare it against another List<String>.

But that seems to take a lot of time. I also tried using DiffUtils in Java, but it does not treat identical lines that appear at different line numbers as the same. Is there any other algorithm that could help me?

+4
7

Read both files and store their lines in a Set:

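import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import org.apache.commons.io.FileUtils;

// file1 and file2 are java.io.File objects; UTF-8 is an assumption about the encoding
Set<String> set1 = new HashSet<String>(FileUtils.readLines(file1, StandardCharsets.UTF_8));
Set<String> set2 = new HashSet<String>(FileUtils.readLines(file2, StandardCharsets.UTF_8));

// lines that occur in both files, regardless of position
Set<String> similars = new HashSet<String>(set1);
similars.retainAll(set2);

set1.removeAll(similars); // now set1 contains the lines found only in file1
set2.removeAll(similars); // now set2 contains the lines found only in file2
System.out.println(set1); // prints the lines unique to file1
System.out.println(set2); // prints the lines unique to file2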
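
To see what the Set approach produces on the sample files from the question, here is a minimal, self-contained sketch. It works on in-memory lists instead of files, and the class name SetDiffExample and the helper onlyIn are hypothetical names chosen for illustration. It also assumes any trailing whitespace has already been trimmed from each line.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SetDiffExample {

    // Returns the lines that occur in `lines` but not in `other`, ignoring position.
    static Set<String> onlyIn(List<String> lines, List<String> other) {
        Set<String> result = new LinkedHashSet<>(lines); // keeps first-seen order
        result.removeAll(new HashSet<>(other));
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("xxxxx", "yyyyy", "zzzzz"); // A.txt
        List<String> b = Arrays.asList("zzzzz", "xxxx", "yyyyy");  // B.txt
        System.out.println(onlyIn(a, b)); // [xxxxx]
        System.out.println(onlyIn(b, a)); // [xxxx]
    }
}
```

Because a Set keeps only one copy of each line, this reports which lines differ but not how often a repeated line occurs in each file.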
+1

HashSet , , :

  • HashSet .

  • Trie

HashSets Tries - Trie ( )?

+2

, . , A B, .

, :

Multiset , , ​​( - , ). "" , . (MultiSet ).

, "" , , . , + 1 . , 1. " ", . , -1. 0, .

  • .
  • , .
  • , (. , )
  • , (. )
  • , .

, , .

, , , , , , , .
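
A counting (multiset) comparison could look like the sketch below. It assumes the idea is to add 1 for every occurrence of a line in the first file and subtract 1 for every occurrence in the second, so that a line whose count ends at 0 appears equally often in both. The class name LineCountDiff and the method diff are hypothetical, and a plain Map stands in for a dedicated Multiset type.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LineCountDiff {

    // Net count per line: +1 for each occurrence in `first`, -1 for each in `second`.
    // Lines whose count ends at 0 occur equally often in both inputs and are dropped.
    static Map<String, Integer> diff(List<String> first, List<String> second) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : first) {
            counts.merge(line, 1, Integer::sum);
        }
        for (String line : second) {
            counts.merge(line, -1, Integer::sum);
        }
        counts.values().removeIf(n -> n == 0);
        return counts;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("xxxxx", "yyyyy", "zzzzz", "yyyyy");
        List<String> b = Arrays.asList("zzzzz", "xxxx", "yyyyy");
        // Positive: surplus in the first input; negative: surplus in the second.
        System.out.println(diff(a, b)); // {xxxxx=1, yyyyy=1, xxxx=-1}
    }
}
```

Each line is hashed once per occurrence, so the work grows roughly linearly with the total number of lines, and repeated lines are counted rather than collapsed as they are in a Set.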

+1

, ,

    // requires java.io.BufferedReader and java.io.FileReader; IOException must be handled or declared
    try (BufferedReader reader1 = new BufferedReader(new FileReader("C:\\file1.txt"));
         BufferedReader reader2 = new BufferedReader(new FileReader("C:\\file2.txt")))
    {
        String line1 = reader1.readLine();
        String line2 = reader2.readLine();
        boolean areEqual = true;
        int lineNum = 1;

        // Walk both files in lockstep and stop at the first mismatch.
        while (line1 != null || line2 != null)
        {
            // Either one file ended early or the lines differ (ignoring case).
            if (line1 == null || line2 == null || !line1.equalsIgnoreCase(line2))
            {
                areEqual = false;
                break;
            }

            line1 = reader1.readLine();
            line2 = reader2.readLine();
            lineNum++;
        }

        if (areEqual)
        {
            System.out.println("Two files have same content.");
        }
        else
        {
            System.out.println("Two files have different content. They differ at line " + lineNum);
            System.out.println("File1 has " + line1 + " and File2 has " + line2 + " at line " + lineNum);
        }
    } // both readers are closed here, even if an exception is thrown
0

, HashMap, .

O (n).

-1

BufferedReader. . . , .

Or just use FileUtils.contentEquals(file1, file2) from org.apache.commons.io.FileUtils.

-1

You can use FileUtils.contentEquals(file1, file2).

It compares the contents of the two files and returns true if they are equal.


-1

Source: https://habr.com/ru/post/1607274/

