A regex Word macro that finds two words within a given distance of each other and then italicizes those words?

I'm just starting to understand regular expressions, and I've found the learning curve pretty steep. However, Stack Overflow has been very helpful in my experimenting. There is a particular Word macro that I would like to write, but I have not figured out a way to do it. I would like to find two words within 10 or so words of each other in a document and then italicize those words. If the words are more than 10 words apart, or are in a different order, I would like the macro not to italicize them.

I use the following regular expression:

\bPanama\W+(?:\w+\W+){0,10}?Canal\b 

However, it only lets me manipulate the entire matched string as a whole, including the arbitrary words in between. Also, the .Replace function only lets me replace that string with a different string, not change formatting styles.

Does anyone more experienced have an idea how to do this? Is it even possible?


EDIT: Here is what I have so far. I have two problems. First, I don't know how to select just the words "Panama" and "Canal" within a matched regular expression and replace only those words (not the intervening words). Second, I don't know how to replace a regex match with different formatting rather than just a different string of text, probably simply because I'm unfamiliar with Word macros.

    Sub RegText()
        Dim re As regExp
        Dim para As Paragraph
        Dim rng As Range
        Set re = New regExp
        re.Pattern = "\bPanama\W+(?:\w+\W+){0,10}?Canal\b"
        re.IgnoreCase = True
        re.Global = True
        For Each para In ActiveDocument.Paragraphs
            Set rng = para.Range
            rng.MoveEnd unit:=wdCharacter, Count:=-1
            Text$ = rng.Text + "Modified"
            rng.Text = re.Replace(rng.Text, Text$)
        Next para
    End Sub

Well, thanks to help from Tim Williams below, I put together the following solution. It is more than a little clumsy in some respects and is by no means pure regex, but it gets the job done. If anyone has a better solution or idea about how to go about this, I would be fascinated to hear it. Again, brute-forcing the formatting change with the find-and-replace function is a bit crude, but at least it works...

    Sub RegText()
        Dim re As regExp
        Dim para As Paragraph
        Dim rng As Range
        Dim txt As String
        Dim allmatches As MatchCollection, m As match

        Set re = New regExp
        re.pattern = "\bPanama\W+(?:\w+\W+){0,13}?Canal\b"
        re.IgnoreCase = True
        re.Global = True

        For Each para In ActiveDocument.Paragraphs
            txt = para.Range.Text
            'any match?
            If re.Test(txt) Then
                'get all matches
                Set allmatches = re.Execute(txt)
                'look at each match and hilight corresponding range
                For Each m In allmatches
                    Debug.Print m.Value, m.FirstIndex, m.Length
                    Set rng = para.Range
                    rng.Collapse wdCollapseStart
                    rng.MoveStart wdCharacter, m.FirstIndex
                    rng.MoveEnd wdCharacter, m.Length
                    rng.Font.ColorIndex = wdOrange
                Next m
            End If
        Next para

        Selection.Find.ClearFormatting
        Selection.Find.Font.ColorIndex = wdOrange
        Selection.Find.Replacement.ClearFormatting
        Selection.Find.Replacement.Font.Italic = True
        With Selection.Find
            .Text = "Panama"
            .Replacement.Text = "Panama"
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With
        Selection.Find.Execute Replace:=wdReplaceAll

        Selection.Find.ClearFormatting
        Selection.Find.Font.ColorIndex = wdOrange
        Selection.Find.Replacement.ClearFormatting
        Selection.Find.Replacement.Font.Italic = True
        With Selection.Find
            .Text = "Canal"
            .Replacement.Text = "Canal"
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With
        Selection.Find.Execute Replace:=wdReplaceAll

        Selection.Find.ClearFormatting
        Selection.Find.Font.ColorIndex = wdOrange
        Selection.Find.Replacement.ClearFormatting
        Selection.Find.Replacement.Font.ColorIndex = wdBlack
        With Selection.Find
            .Text = ""
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With
        Selection.Find.Execute Replace:=wdReplaceAll
    End Sub
2 answers

I'm far from being a decent Word programmer, but this might get you started.

EDIT: Updated to include a parameterized version.

    Sub Tester()
        HighlightIfClose ActiveDocument, "panama", "canal", wdBrightGreen
        HighlightIfClose ActiveDocument, "red", "socks", wdRed
    End Sub

    Sub HighlightIfClose(doc As Document, word1 As String, _
                         word2 As String, clrIndex As WdColorIndex)
        Dim re As RegExp
        Dim para As Paragraph
        Dim rng As Range
        Dim txt As String
        Dim allmatches As MatchCollection, m As Match

        Set re = New RegExp
        re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0,10}?" _
                     & word2 & "\b"
        re.IgnoreCase = True
        re.Global = True

        'use the document that was passed in, not ActiveDocument
        For Each para In doc.Paragraphs
            txt = para.Range.Text
            'any match?
            If re.Test(txt) Then
                'get all matches
                Set allmatches = re.Execute(txt)
                'look at each match and hilight corresponding range
                For Each m In allmatches
                    Debug.Print m.Value, m.FirstIndex, m.Length

                    'first word: starts at the beginning of the match
                    Set rng = para.Range
                    rng.Collapse wdCollapseStart
                    rng.MoveStart wdCharacter, m.FirstIndex
                    rng.MoveEnd wdCharacter, Len(word1)
                    rng.HighlightColorIndex = clrIndex

                    'second word: ends at the end of the match
                    Set rng = para.Range
                    rng.Collapse wdCollapseStart
                    rng.MoveStart wdCharacter, m.FirstIndex + (m.Length - Len(word2))
                    rng.MoveEnd wdCharacter, Len(word2)
                    rng.HighlightColorIndex = clrIndex
                Next m
            End If
        Next para
    End Sub
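Since the question asks for italics rather than highlighting, the same offset arithmetic can be pointed at Font.Italic. The following is only a sketch under some assumptions: ItalicizeIfClose and its maxGap parameter are hypothetical names, the two words are assumed to contain no regex metacharacters, and the character offsets in para.Range.Text are assumed to line up with document positions (which may not hold in paragraphs containing fields or other special content). It uses late binding, so it should not need a reference to the VBScript regex library.

```vba
' Sketch only: ItalicizeIfClose and maxGap are hypothetical names.
Sub ItalicizeIfClose(doc As Document, word1 As String, _
                     word2 As String, Optional maxGap As Long = 10)
    Dim re As Object, m As Object
    Dim para As Paragraph
    Dim rng As Range
    Dim matchStart As Long, matchEnd As Long

    ' Late-bound regex object (no library reference required)
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\b" & word1 & "\W+(?:\w+\W+){0," & maxGap & "}?" & word2 & "\b"
    re.IgnoreCase = True
    re.Global = True

    For Each para In doc.Paragraphs
        For Each m In re.Execute(para.Range.Text)
            ' Assumption: text offsets map 1:1 onto document positions
            matchStart = para.Range.Start + m.FirstIndex
            matchEnd = matchStart + m.Length

            ' First word sits at the start of the match
            Set rng = doc.Range(matchStart, matchStart + Len(word1))
            rng.Font.Italic = True

            ' Second word sits at the end of the match
            Set rng = doc.Range(matchEnd - Len(word2), matchEnd)
            rng.Font.Italic = True
        Next m
    Next para
End Sub
```

Calling it as ItalicizeIfClose ActiveDocument, "Panama", "Canal" should italicize only the two words and leave the words between them untouched.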

If you are only handling two words at a time, this worked for me, following your example.

    foo ([a-zA-Z0-9]+? ){0,10}bar

Explanation: it matches word 1 (foo) followed by a space, then any word made of alphanumeric characters ([a-zA-Z0-9]+?) followed by a space ( ), between 0 and 10 times ({0,10}), then word 2 (bar). Note the literal space after foo: without it, space-separated lines such as "foo this that bar" would not match.

This one doesn't allow full stops between the words (I didn't know if you wanted them), but if you do, just add . after 0-9 inside the character class.

So your code (pseudo-code) would look something like:

    $matches = preg_match_all(); // Your function to get regex matches in an array
    foreach (those matches) {
        replace(KEY_WORD, <i>KEY_WORD</i>);
    }
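For plain text (outside Word's formatting model), the pseudo-code above could be made concrete with capture groups and a single Replace call, so only the two key words get wrapped. This is a hedged sketch, not the answer's own code: WrapInItalicTags is a hypothetical name, the pattern assumes a literal space after the first word, and the words are assumed to contain no regex metacharacters.

```vba
' Sketch only: WrapInItalicTags is a hypothetical helper name.
Function WrapInItalicTags(ByVal s As String, ByVal word1 As String, _
                          ByVal word2 As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' $1 = first word, $2 = the words in between, $3 = second word
    re.Pattern = "(" & word1 & ") ((?:[a-zA-Z0-9]+? ){0,10})(" & word2 & ")"
    re.Global = True

    ' Re-emit the middle words unchanged; tag only the key words
    WrapInItalicTags = re.Replace(s, "<i>$1</i> $2<i>$3</i>")
End Function
```

For example, WrapInItalicTags("foo this that bar blah", "foo", "bar") should give "<i>foo</i> this that <i>bar</i> blah".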

Hope this helps. My testing below shows which lines it matched.


Matched:

foo this that bar blah

foo economic order war bar

Did not match:

The economic order. war bar

The global order of foo has existed for several centuries, during this period of time people have developed different and complex trade relations related to situations such as agriculture and the bar.
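The checks above can be reproduced with a small test routine. This is a sketch under stated assumptions: IsCloseMatch and TestSamples are hypothetical names, and the pattern is written with a literal space after foo, which is an assumption needed for the space-separated samples to match.

```vba
' Sketch only: IsCloseMatch and TestSamples are hypothetical names.
Function IsCloseMatch(ByVal s As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    ' Assumed pattern: note the literal space after "foo"
    re.Pattern = "foo ([a-zA-Z0-9]+? ){0,10}bar"
    IsCloseMatch = re.Test(s)
End Function

Sub TestSamples()
    ' Prints the result for each sample line to the Immediate window
    Debug.Print IsCloseMatch("foo this that bar blah")
    Debug.Print IsCloseMatch("foo economic order war bar")
    Debug.Print IsCloseMatch("The economic order. war bar")
End Sub
```

The third sample contains no foo at all, so it cannot match. The long sentence fails for a different reason: the comma after "centuries" is not in the character class, and there are more than 10 words between foo and bar anyway.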


Source: https://habr.com/ru/post/919812/

