Get the expression inside a named capture group

I provide a text box where users enter a regular expression to match file names. I plan to detect any named capture groups they supply using the Regex.GetGroupNames() method.

I want to get the expression that they entered into each named capture group.

For example, a user might enter a regex like this:

 December (?<FileYear>\d{4}) Records\.xlsx 

Is there a way to get the subexpression \d{4} without manually parsing the regular expression string?

+5
3 answers

Here is an ugly brute-force extension method that pulls out the subexpression (or subpattern) by scanning the pattern string, without using another Regex:

    public static class RegexExtensions
    {
        // Returns the text between "(?<name>" and its matching ")".
        // Note: does not handle "\\)" or parentheses inside character classes.
        public static string GetSubExpression(this Regex pRegex, string pCaptureName)
        {
            string sRegex = pRegex.ToString();
            string sGroupText = @"(?<" + pCaptureName + ">";
            int iGroupAt = sRegex.IndexOf(sGroupText);
            if (iGroupAt < 0) return ""; // no such named group in the pattern
            string sRemainder = sRegex.Substring(iGroupAt + sGroupText.Length);
            string sThis;
            string sPrev = "";
            int iOpenParenCount = 0;
            int iEnd = 0;
            for (int i = 0; i < sRemainder.Length; i++)
            {
                sThis = sRemainder.Substring(i, 1);
                if (sThis == ")" && sPrev != @"\" && iOpenParenCount == 0)
                {
                    iEnd = i; // unescaped ")" that closes the named group
                    break;
                }
                else if (sThis == ")" && sPrev != @"\")
                {
                    iOpenParenCount--; // closes a nested group
                }
                else if (sThis == "(" && sPrev != @"\")
                {
                    iOpenParenCount++; // opens a nested group
                }
                sPrev = sThis;
            }
            return sRemainder.Substring(0, iEnd);
        }
    }

Usage is as follows:

    Regex reFromUser = new Regex(txtFromUser.Text);
    string[] asGroupNames = reFromUser.GetGroupNames();
    int iItsInt;
    foreach (string sGroupName in asGroupNames)
    {
        if (!Int32.TryParse(sGroupName, out iItsInt)) // skip numbered groups
        {
            string sSubExpression = reFromUser.GetSubExpression(sGroupName);
            // Do what I need to do with the sub-expression
        }
    }

If you then want to generate test or sample data from the subexpression, you can use the NuGet package "Fare":

    // Generate test data for it
    Fare.Xeger X = new Fare.Xeger(sSubExpression);
    string sSample = X.Generate();
+1

Here is a solution that uses a regex to match the capture groups in a regex. The idea comes from a post on using a regex to match nested parentheses:

 \(\?\<(?<MyGroupName>\w+)\> (?<MyExpression> ((?<BR>\()|(?<-BR>\))|[^()]*)+ ) \) 

or, more compactly (the spaced-out version above only works with RegexOptions.IgnorePatternWhitespace):

 \(\?\<(?<MyGroupName>\w+)\>(?<MyExpression>((?<BR>\()|(?<-BR>\))|[^()]*)+)\) 

and using it might look like this:

    string sGetCaptures = @"\(\?\<(?<MyGroupName>\w+)\>(?<MyExpression>((?<BR>\()|(?<-BR>\))|[^()]*)+)\)";
    MatchCollection MC = Regex.Matches(txtFromUser.Text, sGetCaptures);
    foreach (Match M in MC)
    {
        string sGroupName = M.Groups["MyGroupName"].Value;
        string sSubExpression = M.Groups["MyExpression"].Value;
        // Do what I need to do with the sub-expression
        MessageBox.Show(sGroupName + ":" + sSubExpression);
    }

For the example in the original question, the message box shows FileYear:\d{4}
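To see how the balancing-group pattern behaves when the named group itself contains parentheses, here is a minimal console sketch. The class name NamedGroupDemo, the helper Extract, and the sample pattern with the Ext group are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Same pattern as in the answer: BR is pushed on "(" and popped on ")".
    private const string GetCaptures =
        @"\(\?\<(?<MyGroupName>\w+)\>(?<MyExpression>((?<BR>\()|(?<-BR>\))|[^()]*)+)\)";

    // Hypothetical helper: maps each named group to the text of its subexpression.
    public static Dictionary<string, string> Extract(string userPattern)
    {
        var result = new Dictionary<string, string>();
        foreach (Match m in Regex.Matches(userPattern, GetCaptures))
        {
            result[m.Groups["MyGroupName"].Value] = m.Groups["MyExpression"].Value;
        }
        return result;
    }

    public static void Main()
    {
        // Assumed sample input: a named group that wraps a nested group.
        foreach (var pair in Extract(@"Report\.(?<Ext>(xlsx|csv))"))
        {
            Console.WriteLine(pair.Key + ":" + pair.Value);
        }
    }
}
```

The pattern counts every "(" and ")" it sees, so an escaped parenthesis such as \( or a parenthesis inside a character class can still throw the count off.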

0

This pattern (?<=\(\?<\w+\>)([^)]+) returns the expression inside each named capture group. It uses a positive lookbehind to make sure the matched text is preceded by (?<...>.


    string data = @"December (?<FileYear>\d{4}) Records\.xlsx";
    string pattern = @"(?<=\(\?<\w+\>)([^)]+)";
    Regex.Matches(data, pattern)
         .OfType<Match>()
         .Select(mt => mt.Groups[0].Value)

returns one element

\d{4}

Similarly, data such as (?<FileMonth>[^\s]+)\s+(?<FileYear>\d{4}) Records\.xlsx returns two matches:

[^\s]+

\d{4}
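Because the lookbehind pattern relies on [^)]+, the match ends at the first closing parenthesis. The sketch below runs it on the two-group example and on a nested group to show the difference. The class name LookbehindDemo, the helper Extract, and the Ext sample pattern are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: returns the text matched after each "(?<name>".
    public static string[] Extract(string userPattern)
    {
        string pattern = @"(?<=\(\?<\w+\>)([^)]+)";
        return Regex.Matches(userPattern, pattern)
                    .OfType<Match>()
                    .Select(mt => mt.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Two flat groups: each subexpression comes back whole.
        Console.WriteLine(string.Join(" | ",
            Extract(@"(?<FileMonth>[^\s]+)\s+(?<FileYear>\d{4}) Records\.xlsx")));

        // Assumed nested sample: the match stops at the first ")",
        // so the subexpression is cut short to "(xlsx|csv".
        Console.WriteLine(string.Join(" | ",
            Extract(@"Report\.(?<Ext>(xlsx|csv))")));
    }
}
```

For patterns where named groups may contain nested groups or escaped parentheses, a parenthesis-counting approach is likely the safer choice.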

0

Source: https://habr.com/ru/post/1266564/

