Regular expression to extract part of a string

I have a line in the form

Foo "Foo" "Some Foo" "Some Foo and more" 

I need to extract the Foo value when it appears inside quotation marks, where it may be surrounded by any number of alphanumeric and whitespace characters. So, for the examples above, I would like the results to be

 <NoMatch> Foo Foo Foo 

I tried to get this to work, and this is the pattern I have so far, using lookahead / lookbehind for the quotes. It works for "Foo" but not for the others.

 (?<=")Foo(?=") 

Further expanding this to

 (?<=")(?<=.*?)Foo(?=.*?)(?=") 

does not work.

Any help would be appreciated!

4 answers

If the quotation marks are correctly balanced and the quoted strings do not span multiple lines, you can simply look ahead in the line to check if an even number of quotes follows. If this is not the case, we know that we are inside the quoted string:

 Foo(?![^"\r\n]*(?:"[^"\r\n]*"[^"\r\n]*)*$) 

Explanation:

 Foo              # Match Foo
 (?!              # only if the following can't be matched here:
  [^"\r\n]*       # Any number of characters except quotes or newlines
  (?:             # followed by
   "[^"\r\n]*     # (a quote and any number of non-quotes/newlines
   "[^"\r\n]*     # twice)
  )*              # any number of times.
  $               # End of the line
 )                # End of lookahead assertion

See it live on regex101.com.
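To check the pattern outside regex101, here is a minimal sketch in Python. The choice of Python's `re` module and the names `INSIDE_QUOTES`, `line` and `starts` are assumptions for illustration; the question does not say which regex engine is being used.

```python
import re

# Pattern from the answer above: match Foo only when the rest of the line
# does NOT contain an even number of quotes, i.e. when Foo is inside a
# quoted string. re.MULTILINE lets $ match at the end of each line.
INSIDE_QUOTES = re.compile(
    r'Foo(?![^"\r\n]*(?:"[^"\r\n]*"[^"\r\n]*)*$)',
    re.MULTILINE,
)

line = 'Foo "Foo" "Some Foo" "Some Foo and more"'

# Start offsets of every Foo that sits inside quotation marks.
starts = [m.start() for m in INSIDE_QUOTES.finditer(line)]
print(starts)
```

The unquoted Foo at the start of the line is skipped, and the three quoted occurrences are reported.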


In many regex engines, lookbehind ( (?<=something) ) does not work with variable-length patterns such as .* (lookahead, (?=something), usually does). Try the following instead:

 (?<=")(.*?)(Foo)(.*?)(?=") 

and then use the captured groups (depending on your language: $1, $2, ... or \1, \2, ... or the members of some match array or similar).
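As a rough illustration of both points, here is a sketch in Python. Python's `re` is only one possible engine, picked here as an assumption, and the variable names are hypothetical. It first shows that the question's variable-length lookbehind is rejected, then reads the capture groups of the suggested pattern.

```python
import re

line = 'Foo "Foo" "Some Foo" "Some Foo and more"'

# Python's re module refuses to compile a variable-length lookbehind
# such as (?<=.*?) from the question.
try:
    re.compile(r'(?<=")(?<=.*?)Foo(?=.*?)(?=")')
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False

# The suggested pattern keeps both lookarounds fixed-length and moves the
# surrounding text into groups 1 and 3; Foo itself is group 2.
GROUPED = re.compile(r'(?<=")(.*?)(Foo)(.*?)(?=")')
first = GROUPED.search(line)
print(lookbehind_ok, first.groups(), first.start(2))
```

Note that this pattern only requires a quote somewhere on each side, so by itself it does not verify that both quotes belong to the same quoted string.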


Try something along the lines of this pattern:

 "[^"]*?Foo[^"]*?" 
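A small sketch of how the quoted-string pattern from this answer behaves on the sample line, again assuming Python's `re` module (the names `QUOTED_FOO` and `matches` are made up for the example). Unlike the lookaround versions, each match here includes the surrounding quotes and text.

```python
import re

# Pattern from the answer: a quote, any non-quote characters, Foo,
# any non-quote characters, and a closing quote.
QUOTED_FOO = re.compile(r'"[^"]*?Foo[^"]*?"')

line = 'Foo "Foo" "Some Foo" "Some Foo and more"'

# Each match is the whole quoted string that contains Foo.
matches = QUOTED_FOO.findall(line)
print(matches)
```

If only Foo itself is needed, one option is to wrap it in a capture group and read that group instead of the whole match.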

In Notepad++:

 search : ("[^"]*)Foo([^"]*")
 replace: $1Bar$2
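The same search and replace can be sketched outside Notepad++. The version below assumes Python's `re.sub`, where group references in the replacement are written \1 and \2 rather than $1 and $2; the variable names are illustrative only.

```python
import re

line = 'Foo "Foo" "Some Foo" "Some Foo and more"'

# Group 1 keeps the opening quote and any text before Foo, group 2 keeps
# the text after Foo and the closing quote; only Foo itself is replaced.
replaced = re.sub(r'("[^"]*)Foo([^"]*")', r'\1Bar\2', line)
print(replaced)
```

The unquoted Foo at the start of the line is left alone, while each quoted Foo becomes Bar.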

Source: https://habr.com/ru/post/1482542/

