I need to compute diffs between Java strings, and I would like to be able to rebuild a string from its original version plus the diffs. Has anyone done this in Java? Which library did you use?
    String a1; // This can be a long text
    String a2; // e.g. the text above with spelling corrections
    String a3; // e.g. the text above with spelling corrections and an additional sentence

    Diff diff = new Diff();
    String differences_a1_a2 = Diff.getDifferences(a1, a2);
    String differences_a2_a3 = Diff.getDifferences(a2, a3);
    String[] diffs = new String[]{a1, differences_a1_a2, differences_a2_a3};
    String new_a3 = Diff.build(diffs);
    a3.equals(new_a3); // this is true
This library seems to do the trick: google-diff-match-patch. It can create a patch string from the differences and lets you re-apply that patch later.
Another solution could be java-diff-utils: https://code.google.com/p/java-diff-utils/
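Apache Commons Lang has a string difference method, StringUtils.difference. It returns the remainder of the second string, starting from the position where it first differs from the first string.

    import org.apache.commons.lang.StringUtils;

    StringUtils.difference("foobar", "foo");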
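To make it clearer what that call gives you, here is a plain-Java stand-in that follows the documented behavior of StringUtils.difference (the tail of the second string from the first differing index). The class name DifferenceSketch is made up for illustration, and null handling is left out, so treat it as a sketch rather than the library's actual code.

```java
final class DifferenceSketch {
    // First index at which the two strings differ, or -1 if they are equal.
    static int indexOfDifference(String s1, String s2) {
        int limit = Math.min(s1.length(), s2.length());
        for (int i = 0; i < limit; i++) {
            if (s1.charAt(i) != s2.charAt(i)) {
                return i;
            }
        }
        return s1.length() == s2.length() ? -1 : limit;
    }

    // Tail of s2 starting where it stops matching s1 (empty if nothing differs).
    static String difference(String s1, String s2) {
        int at = indexOfDifference(s1, s2);
        return at == -1 ? "" : s2.substring(at);
    }

    public static void main(String[] args) {
        System.out.println("[" + difference("foobar", "foo") + "]");  // prints []
        System.out.println("[" + difference("foo", "foobar") + "]");  // prints [bar]
        System.out.println("[" + difference("abcde", "abxyz") + "]"); // prints [xyz]
    }
}
```

Since the result is only a tail of the second string and carries no position, it can't be used on its own to rebuild one string from the other. For the round trip asked about in the question, a real diff/patch library is the better fit.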
As Torsten says, you can use
    import org.apache.commons.lang.StringUtils;

    System.err.println(StringUtils.getLevenshteinDistance("foobar", "bar")); // prints 3
The java-diff-utils library may be useful.
Use the Levenshtein distance and extract the edit log from the matrix the algorithm builds up. The Wikipedia article links to a couple of implementations; I'm sure there is a Java implementation among them.
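Here is a rough sketch of that idea in plain Java: fill the usual Levenshtein matrix, walk back from the bottom-right cell to collect the edit operations, and replay them against the original to rebuild the new string. The class name and the one-op-per-character encoding ("=", "-", "+c", "~c") are made up for illustration. The matrix needs O(n*m) memory, so this only suits fairly short strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LevenshteinScript {
    // d[i][j] = Levenshtein distance between the first i chars of a and the first j chars of b.
    static int[][] matrix(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d;
    }

    // Backtrack through the matrix. Ops: "=" keep, "-" delete, "+c" insert c, "~c" replace with c.
    static List<String> script(String a, String b) {
        int[][] d = matrix(a, b);
        List<String> ops = new ArrayList<>();
        int i = a.length(), j = b.length();
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && a.charAt(i - 1) == b.charAt(j - 1) && d[i][j] == d[i - 1][j - 1]) {
                ops.add("="); i--; j--;
            } else if (i > 0 && j > 0 && d[i][j] == d[i - 1][j - 1] + 1) {
                ops.add("~" + b.charAt(j - 1)); i--; j--;
            } else if (i > 0 && d[i][j] == d[i - 1][j] + 1) {
                ops.add("-"); i--;
            } else {
                ops.add("+" + b.charAt(j - 1)); j--;
            }
        }
        Collections.reverse(ops); // ops were collected back to front
        return ops;
    }

    // Replay the edit script against the original string to rebuild the changed one.
    static String apply(String a, List<String> ops) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        for (String op : ops) {
            switch (op.charAt(0)) {
                case '=': out.append(a.charAt(i++)); break;
                case '-': i++; break;
                case '+': out.append(op.charAt(1)); break;
                case '~': out.append(op.charAt(1)); i++; break;
                default: throw new IllegalArgumentException("Unknown op: " + op);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> ops = script("kitten", "sitting");
        System.out.println(ops);
        System.out.println(apply("kitten", ops)); // prints sitting
    }
}
```

The number of ops other than "=" equals the Levenshtein distance, so the script is as short as a character-level script can be. For long texts a dedicated diff library will be much faster and produce more compact patches.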
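Levenshtein distance is closely related to the Longest Common Subsequence (LCS) problem: LCS corresponds to an edit distance that allows only insertions and deletions, with no substitutions. You can also look at that.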
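As an illustration of the LCS route, the sketch below builds the LCS length table and uses it to emit a character-level diff made only of insertions and deletions. The class name and the two-characters-per-entry format (a marker ' ', '-' or '+' followed by the character) are assumptions for this example, not any library's format. Either side can be recovered from the diff string alone.

```java
final class LcsDiff {
    // len[i][j] = length of the longest common subsequence of a.substring(i) and b.substring(j).
    static int[][] lcsTable(String a, String b) {
        int[][] len = new int[a.length() + 1][b.length() + 1];
        for (int i = a.length() - 1; i >= 0; i--) {
            for (int j = b.length() - 1; j >= 0; j--) {
                len[i][j] = a.charAt(i) == b.charAt(j)
                        ? len[i + 1][j + 1] + 1
                        : Math.max(len[i + 1][j], len[i][j + 1]);
            }
        }
        return len;
    }

    // Emit (marker, char) pairs: ' ' common, '-' only in a, '+' only in b.
    static String diff(String a, String b) {
        int[][] len = lcsTable(a, b);
        StringBuilder sb = new StringBuilder();
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            if (a.charAt(i) == b.charAt(j)) {
                sb.append(' ').append(a.charAt(i)); i++; j++;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                sb.append('-').append(a.charAt(i)); i++;
            } else {
                sb.append('+').append(b.charAt(j)); j++;
            }
        }
        while (i < a.length()) sb.append('-').append(a.charAt(i++));
        while (j < b.length()) sb.append('+').append(b.charAt(j++));
        return sb.toString();
    }

    // Rebuild one side of the diff by skipping the entries that belong only to the other side.
    static String side(String diff, char skipMarker) {
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < diff.length(); k += 2) {
            if (diff.charAt(k) != skipMarker) sb.append(diff.charAt(k + 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String d = diff("kitten", "sitting");
        System.out.println(d);
        System.out.println(side(d, '+')); // old text: kitten
        System.out.println(side(d, '-')); // new text: sitting
    }
}
```

The number of '-' and '+' entries equals |a| + |b| - 2 * LCS(a, b). For "kitten" and "sitting" that is 5, compared with a Levenshtein distance of 3, because each substitution is counted as a delete plus an insert.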
If you need to deal with diffs between large amounts of data and want the diffs compressed efficiently, you could try a Java implementation of xdelta, which in turn implements RFC 3284 (VCDIFF) for binary diffs (it should work with strings too).
    public class Stringdiff {
        public static void main(String[] args) {
            System.out.println(strcheck("sum", "sumsum")); // prints "sum"
        }

        // Returns the tail of the longer string, starting at the first index where the two differ.
        public static String strcheck(String str1, String str2) {
            if (str1 == null || str2 == null) {
                return "Invalid"; // reject missing input
            }
            int num = diffcheck1(str1, str2);
            if (num == -1) {
                return "Empty"; // the strings are identical
            }
            return str1.length() > str2.length() ? str1.substring(num) : str2.substring(num);
        }

        // Returns the first index at which the strings differ, or -1 if they are equal.
        public static int diffcheck1(String str1, String str2) {
            int limit = Math.min(str1.length(), str2.length());
            for (int i = 0; i < limit; i++) {
                if (str1.charAt(i) != str2.charAt(i)) {
                    return i;
                }
            }
            // One string is a prefix of the other: they differ where the shorter one ends.
            return str1.length() == str2.length() ? -1 : limit;
        }
    }