Regular expression to remove 'eed' from a string

I am trying to replace 'eed' and 'eedly' with 'ee' in words where a vowel appears somewhere before the suffix ('eed' or 'eedly').

So, for example, the word 'indeed' would become 'indee', because there is a vowel ('i') before 'eed'. On the other hand, the word 'feed' would not change, because there is no vowel before the suffix 'eed'.

I have this regular expression: (?i)([aeiou]([aeiou])*[e{2}][d]|[dly]\\b). You can see what happens with it here.

As you can see, this correctly identifies words that end in 'eed', but it does not correctly identify 'eedly'.

Also, when it does the replacement, it replaces all words ending in 'eed', even words such as 'feed', where 'eed' should not be removed.

What should I consider here in order to correctly identify words based on the rules that I have indicated?

2 answers

You can use:

 str = str.replaceAll("(?i)\\b(\\w*?[aeiou]\\w*)eed(?:ly)?", "$1ee"); 

Updated RegEx Demo

The \\b(\\w*?[aeiou]\\w*) part before eed or eedly ensures that at least one vowel precedes the suffix within the same word.
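To check the behavior on the words from the question, here is a minimal sketch. The class name EedDemo, the helper strip, and the extra sample words are made up for illustration; only the regex and the replacement string come from the answer.

```java
public class EedDemo {
    // Regex from the answer: group 1 must contain at least one vowel
    // before the "eed" / "eedly" suffix in the same word.
    static final String PATTERN = "(?i)\\b(\\w*?[aeiou]\\w*)eed(?:ly)?";

    // Hypothetical helper: keep group 1 and append "ee".
    static String strip(String s) {
        return s.replaceAll(PATTERN, "$1ee");
    }

    public static void main(String[] args) {
        for (String w : new String[] {"indeed", "feed", "agreedly", "need"}) {
            System.out.println(w + " -> " + strip(w));
        }
        // prints:
        // indeed -> indee
        // feed -> feed
        // agreedly -> agree
        // need -> need
    }
}
```

Note that the pattern has no trailing \\b, so it would also match inside a longer word such as "proceeding". If only word endings should be affected, appending \\b to the pattern may be worth trying.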

To make this regex faster, you can use a negated character class instead:

 \\b([^\\Waeiou]*[aeiou]\\w*)eed(?:ly)? 

RegEx breakdown:

 \\b                # word boundary
 (                  # start captured group #1
   [^\\Waeiou]*     # match 0 or more word characters that are not vowels
   [aeiou]          # match one vowel
   \\w*             # followed by 0 or more word characters
 )                  # end captured group #1
 eed                # followed by literal "eed"
 (?:                # start non-capturing group
   ly               # match literal "ly"
 )?                 # end non-capturing group, ? makes it optional

Replacement:

 "$1ee", which is a back-reference to captured group #1 followed by "ee"
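As a rough sketch of how the two variants could be used side by side, the class below precompiles both patterns and applies the same replacement. The names FastEedDemo, LAZY, NEGATED, and strip, as well as the sample sentence, are assumptions for illustration. The sketch only checks that both patterns give the same result on this input; it does not measure speed.

```java
import java.util.regex.Pattern;

public class FastEedDemo {
    // First regex from the answer (lazy \w*? prefix).
    static final Pattern LAZY =
        Pattern.compile("(?i)\\b(\\w*?[aeiou]\\w*)eed(?:ly)?");
    // Second regex from the answer (negated character class prefix).
    static final Pattern NEGATED =
        Pattern.compile("(?i)\\b([^\\Waeiou]*[aeiou]\\w*)eed(?:ly)?");

    // Hypothetical helper: replace each match with group 1 followed by "ee".
    static String strip(Pattern p, String s) {
        return p.matcher(s).replaceAll("$1ee");
    }

    public static void main(String[] args) {
        String text = "indeed we agreedly feed the need to succeed";
        String a = strip(LAZY, text);
        String b = strip(NEGATED, text);
        System.out.println(a);
        // prints: indee we agree feed the need to succee
        System.out.println(a.equals(b));
        // prints: true
    }
}
```

Compiling the pattern once with Pattern.compile avoids recompiling it on every String.replaceAll call, which can matter when the replacement runs over many strings.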

Match dly before d. Otherwise, the regex engine stops as soon as it has matched eed and never tries eedly.

 (?i)([aeiou]([aeiou])*[e{2}](dly|d)) 
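The effect of alternation order can be seen in isolation with two deliberately simplified patterns. They are hypothetical and leave out the vowel check entirely; the class and method names are also made up for this sketch.

```java
public class AlternationOrderDemo {
    // Short alternative first: the engine accepts "d" and stops, leaving "ly" behind.
    static String shortFirst(String s) {
        return s.replaceAll("ee(d|dly)", "ee");
    }

    // Long alternative first: "dly" is tried before falling back to "d".
    static String longFirst(String s) {
        return s.replaceAll("ee(dly|d)", "ee");
    }

    public static void main(String[] args) {
        System.out.println(shortFirst("agreedly")); // prints: agreely
        System.out.println(longFirst("agreedly"));  // prints: agree
        System.out.println(longFirst("agreed"));    // prints: agree
    }
}
```

Java's regex engine tries alternatives from left to right and takes the first one that lets the overall match succeed, which is why the longer alternative has to come first here.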

Source: https://habr.com/ru/post/1243173/
