You should have used \b(\w+)\b\s+\b\1\b (click here to see the result).
Hope this is what you want.
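To see what that pattern picks up, here is a minimal sketch. The class name AdjacentDupFinder, the helper findPairs and the sample sentence are my own for illustration, not from the question. The group (\w+) captures a word, and \1 requires the very same word to follow after whitespace; the \b anchors keep it from matching inside longer words.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and sample text are illustrative only.
class AdjacentDupFinder {
    // A word, whitespace, then the same word again via the backreference \1.
    static final Pattern DUP = Pattern.compile("\\b(\\w+)\\b\\s+\\b\\1\\b");

    // Collects every "word word" pair found in the text, left to right.
    static List<String> findPairs(String text) {
        List<String> pairs = new ArrayList<>();
        Matcher m = DUP.matcher(text);
        while (m.find()) {
            pairs.add(m.group());
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints [This This, text text, another another]
        System.out.println(findPairs("This This is text text another another"));
    }
}
```

Note that "This is" is not reported: the "is" inside "This" does not start at a word boundary, so \b rejects it.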
Update 1
Okay, so the result you are after is the final string after duplicate removal. Try this:
import java.util.regex.*;

public class MyDup {
    public static void main(String[] args) {
        String input = "This This is text text another another";
        String output = "";
        // (\w+) captures a word; \1 requires the same word to follow it
        Pattern p = Pattern.compile("\\b(\\w+)\\b\\s+\\b\\1\\b",
                Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(input);
        if (!m.find()) {
            output = "No duplicates found, no changes made to data";
        } else {
            // The find() in the if above already consumed the first match,
            // so this loop starts at the second duplicate ("text text").
            while (m.find()) {
                if (output.isEmpty()) {
                    output = input.replaceFirst(m.group(), m.group(1));
                } else {
                    output = output.replaceAll(m.group(), m.group(1));
                }
            }
            // Second pass: collapse the duplicate the first pass skipped.
            // If the first pass changed nothing, keep working on the input.
            if (!output.isEmpty()) {
                input = output;
            }
            output = input;
            m = p.matcher(input);
            while (m.find()) {
                output = output.replaceAll(m.group(), m.group(1));
            }
        }
        System.out.println("After removing duplicate the final string is " + output);
    }
}
Run this code and see what you get as output. It should answer your question.
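For comparison, here is a shorter sketch of the same idea that lets the regex engine do the replacing through the "$1" replacement reference. This is an alternative I am suggesting, not the code above; the class name DuplicateCollapser and the method collapse are made up for the example. It repeats the replacement until the text stops changing, so runs like "a a a a" also collapse to a single word.

```java
import java.util.regex.Pattern;

// Hypothetical alternative; not the MyDup class from the answer.
class DuplicateCollapser {
    static final Pattern DUP =
            Pattern.compile("\\b(\\w+)\\b\\s+\\b\\1\\b", Pattern.CASE_INSENSITIVE);

    // Replaces each "word word" pair with the captured word ($1),
    // repeating until a pass leaves the text unchanged.
    static String collapse(String text) {
        String previous;
        String current = text;
        do {
            previous = current;
            current = DUP.matcher(previous).replaceAll("$1");
        } while (!current.equals(previous));
        return current;
    }

    public static void main(String[] args) {
        // prints This is text another
        System.out.println(collapse("This This is text text another another"));
    }
}
```

With CASE_INSENSITIVE set, "The the cat" becomes "The cat", because $1 holds the first of the two words.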
Note
In output, each duplicate is replaced with a single word. Isn't that what you want?
When I put System.out.println(m.group() + " : " + m.group(1)); as the first statement in the if branch, the first line I get is text text : text, that is, the duplicate is replaced with one word.
else {
    while (m.find()) {
        if (output.isEmpty()) {
            System.out.println(m.group() + " : " + m.group(1));
            output = input.replaceFirst(m.group(), m.group(1));
        } else {
            ...
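The difference between m.group() and m.group(1) in that println can be tried in isolation. A minimal sketch, assuming a made-up class GroupTrace with helpers describeFirst and collapseFirst (neither is part of the code above): group() is the whole matched pair, group(1) is the single captured word that replaces it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class GroupTrace {
    static final Pattern DUP = Pattern.compile("\\b(\\w+)\\b\\s+\\b\\1\\b");

    // Formats the first match as "whole match : captured word", or null if none.
    static String describeFirst(String text) {
        Matcher m = DUP.matcher(text);
        return m.find() ? m.group() + " : " + m.group(1) : null;
    }

    // Replaces only the first duplicated pair with its captured word.
    static String collapseFirst(String text) {
        return DUP.matcher(text).replaceFirst("$1");
    }

    public static void main(String[] args) {
        String sample = "is text text another";
        System.out.println(describeFirst(sample)); // prints text text : text
        System.out.println(collapseFirst(sample)); // prints is text another
    }
}
```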
Hope you now see what is happening :)
Good luck !!! Hurrah!!!