Regex: extract the sentence containing a matching word, without stopping at "Mr.", "Mrs.", etc.

I created a regular expression that can retrieve sentences containing a matching word.

[^.|?|!]*\<friends\>[^.|!|?]*[\"!?:\.] 

But this does not work when Mr., Mrs., Dr., etc. are present in the sentence.

For instance:

 The adventures are great. I don't know whether you know that Dr. Watson and Mr. Holmes are good friends, Ms. Adler. 

My desired result:

 I don't know whether you know that Dr. Watson and Mr. Holmes are good friends, Ms. Adler. 

How can I do this?

5 answers

Use a negative lookahead.

 (?:(?!Mr|Ms|Dr|[.?!]).|Mr\.|Ms\.|Dr\.)*\bfriends\b(?:(?!Mr|Ms|Dr|[.?!]).|Mr\.|Ms\.|Dr\.)*[\"!?:.] 
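A minimal sketch of trying this pattern in Python, assuming the standard `re` module and the sample text from the question. The names `PATTERN`, `text`, and `sentence` are illustrative only. The match keeps the space that follows the previous sentence, so it is stripped afterwards.

```python
import re

# The lookahead pattern from the answer above, unchanged.
PATTERN = re.compile(
    r'(?:(?!Mr|Ms|Dr|[.?!]).|Mr\.|Ms\.|Dr\.)*\bfriends\b'
    r'(?:(?!Mr|Ms|Dr|[.?!]).|Mr\.|Ms\.|Dr\.)*[\"!?:.]'
)

# Sample text from the question.
text = ("The adventures are great. I don't know whether you know that "
        "Dr. Watson and Mr. Holmes are good friends, Ms. Adler.")

match = PATTERN.search(text)
# The match begins with the space after the previous period, so strip it.
sentence = match.group(0).strip() if match else None
print(sentence)
```

Because the lookahead tests for a bare `Mr`, `Ms`, or `Dr`, a word such as "Mrs." or "Drive" would stop the match early; extending the alternation is left as an exercise.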



You can use something like this: (?:(Dr|Mr|Ms)\.|[^.])+ and return results only if group 1 has a match.
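One way this could be applied, sketched in Python with the standard `re` module. The pattern itself contains no keyword, so the second step that filters on the word "friends" is an assumption of this sketch and not part of the answer; `SENTENCE`, `text`, and `matches` are illustrative names.

```python
import re

# Pattern from the answer: a period ends a chunk unless it follows Dr, Mr, or Ms.
SENTENCE = re.compile(r'(?:(Dr|Mr|Ms)\.|[^.])+')

# Sample text from the question.
text = ("The adventures are great. I don't know whether you know that "
        "Dr. Watson and Mr. Holmes are good friends, Ms. Adler.")

# Assumption: keep only the chunks that contain the whole word "friends".
matches = [
    m.group(0).strip()
    for m in SENTENCE.finditer(text)
    if re.search(r'\bfriends\b', m.group(0))
]
print(matches)
```

Note that each chunk stops before the closing period, so the terminator is not part of the result.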

 \.((([^.]*Mr\.)|([^.]*Dr\.)|([^.]*Ms\.))*[^.]*)(?<=friends) 

This should work. Change the word "friends" to whatever you want to find in the sentence. You can add more abbreviations to skip by appending them right after |([^.]*Ms\.) in the same style; for example, if you also wanted to ignore M., you would add |([^.]*M\.), and the regular expression would then look like this:

 \.((([^.]*Mr\.)|([^.]*Dr\.)|([^.]*Ms\.)|([^.]*M\.))*[^.]*)(?<=friends) 

Updated solution (it is a little awkward now, though :)). The match is saved in capture group 0, which includes the leading period; capture group 1 holds the sentence without it:

 \.(((([^.]*Mr\.)|([^.]*Dr\.)|([^.]*Ms\.)|([^.]*M\.))*[^.]*)(?<=friends)((([^.]*Mr\.)|([^.]*Dr\.)|([^.]*Ms\.)|([^.]*M\.))*[^.!?]*)) 
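As a rough sketch, the same expression can be assembled from a list of abbreviations in Python, which makes adding another one less error-prone. The names `ABBREVIATIONS`, `skip`, `word`, and `pattern` are hypothetical, and the word is assumed to contain no regex metacharacters.

```python
import re

# Abbreviations whose trailing period must not end the sentence.
ABBREVIATIONS = ["Mr", "Dr", "Ms", "M"]
# Builds ([^.]*Mr\.)|([^.]*Dr\.)|([^.]*Ms\.)|([^.]*M\.)
skip = "|".join(rf"([^.]*{a}\.)" for a in ABBREVIATIONS)

word = "friends"  # assumed to be a plain word without special characters
pattern = re.compile(rf"\.((({skip})*[^.]*)(?<={word})(({skip})*[^.!?]*))")

# Sample text from the question.
text = ("The adventures are great. I don't know whether you know that "
        "Dr. Watson and Mr. Holmes are good friends, Ms. Adler.")

match = pattern.search(text)
# Group 1 excludes the period of the previous sentence that anchors the match.
sentence = match.group(1).strip() if match else None
print(sentence)
```

Since the expression starts with `\.`, a sentence at the very beginning of the text has no preceding period and is not matched by this sketch.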

If the language you use supports the PCRE flavor, this could be a first solution:

((?:[^.?!]|(?<=Mr|Mrs|Ms|Dr)\.)*)friends(?1)

Demo and explanation on regex101
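Python's standard `re` module supports neither the `(?1)` subroutine call nor lookbehinds whose alternatives differ in length, so the PCRE pattern cannot be used there unchanged. A hedged approximation, assuming one fixed-width lookbehind per abbreviation and the subpattern simply repeated on both sides (`BODY`, `PATTERN`, and `text` are illustrative names):

```python
import re

# One fixed-width lookbehind per abbreviation replaces (?<=Mr|Mrs|Ms|Dr).
BODY = r'(?:[^.?!]|(?:(?<=Mr)|(?<=Mrs)|(?<=Ms)|(?<=Dr))\.)*'
# Repeating BODY stands in for the (?1) subroutine call.
PATTERN = re.compile(BODY + r'friends' + BODY)

# Sample text from the question.
text = ("The adventures are great. I don't know whether you know that "
        "Dr. Watson and Mr. Holmes are good friends, Ms. Adler.")

match = PATTERN.search(text)
# Like the PCRE original, the match stops before the closing punctuation.
sentence = match.group(0).strip() if match else None
print(sentence)
```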


You can use this awful-looking regex:

 /[a-z](?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*friends(?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*[\"!?:\.]/i 

You can replace the word friends with whatever you want to match.

Please note that it will NOT match if friends is the first word.

You can use this one, which also matches if friends is the first word:

 /(?:friends|[a-z])?(?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*friends(?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*[\"!?:\.]/i 

This will also match the space immediately before the sentence.

If this is a problem, you can use this:

 /\s*((?:friends|[a-z])?(?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*friends(?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*[\"!?:\.])/i 

This will save the entire sentence in $1 and will work if friends is the first word.

All have been tested using JavaScript and should work in other flavors.
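For comparison, here is a sketch of the last variant ported to Python, with `re.IGNORECASE` standing in for the `/i` flag. It assumes the letter class is meant to be `[a-z]`; `PATTERN`, `text`, and `sentence` are illustrative names.

```python
import re

# Port of the final pattern; group 1 plays the role of $1.
# Assumption: the letter class is [a-z].
PATTERN = re.compile(
    r'\s*((?:friends|[a-z])?(?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*friends'
    r'(?:(?:(?:drs?|m[rs])\.)|[^.|?|!])*[\"!?:\.])',
    re.IGNORECASE,
)

# Sample text from the question.
text = ("The adventures are great. I don't know whether you know that "
        "Dr. Watson and Mr. Holmes are good friends, Ms. Adler.")

match = PATTERN.search(text)
# The leading \s* swallows the space, so group 1 starts at the sentence itself.
sentence = match.group(1) if match else None
print(sentence)
```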


Source: https://habr.com/ru/post/1204633/

