Get capture group of all iterations

I am working with C# regex.

Input text:

headera
aa1aaa
aa2aaa
aa3aaa

headerb
aa4aaa
aa5aaa
aa6aaa

headerc
aa7aaa
aa8aaa
aa9aaa

I would like to capture only the numbers 4, 5 and 6, which are between headerb and headerc.

My attempts:

I was able to capture the numbers under headera and headerb with the pattern below. I cannot apply the same concept with a lookbehind, since it should be zero-width, so quantifiers are not allowed.

aa(\d+)aaa(?=[\s|\S]*headerc)

Repeating a capture group only yields the last iteration. I could not come up with a regex that handles multiple instances.

Help. Thanks

[SOLVED] Taking advantage of .NET, you can use a variable-width lookbehind. Any of the patterns below works:

@"(?<=headerb[\s|\S]*)aa(\d)aaa(?=[\s\S]*headerc)"
@"(?s)(?<=\bheaderb\b.*?)\d+(?=.*?\bheaderc\b)"
@"(?<=\bheaderb\b(?:(?!\bheaderc\b)[\s\S])*)aa(\d+)aaa"
+4
4 answers

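C# supports variable-width lookbehind, so you can put a tempered greedy token inside it:

(?<=\bheaderb\b(?:(?!\bheaderc\b)[\s\S])*)aa(\d+)aaa

The lookbehind requires headerb before the match with no headerc in between. The number is in group 1.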
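As a rough illustration of the lookbehind pattern above, here is a minimal sketch that runs it over the sample text from the question. It assumes the text uses "\n" line endings and is written with top-level statements (C# 9 or later). The variable names are illustrative only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample text from the question, with "\n" line endings (an assumption).
string input =
    "headera\naa1aaa\naa2aaa\naa3aaa\n\n" +
    "headerb\naa4aaa\naa5aaa\naa6aaa\n\n" +
    "headerc\naa7aaa\naa8aaa\naa9aaa\n";

// Lookbehind: headerb must come first, with no headerc between it and the match.
string pattern = @"(?<=\bheaderb\b(?:(?!\bheaderc\b)[\s\S])*)aa(\d+)aaa";

// Each match is one "aaNaaa" item; group 1 holds the number.
var numbers = Regex.Matches(input, pattern)
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToList();

Console.WriteLine(string.Join(", ", numbers)); // prints: 4, 5, 6
```

Because every number is a separate match, there is no need to repeat a capture group here.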

+4

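The pattern aa(\d+)aaa(?=[\s|\S]*headerc) matches aa, then 1 or more digits, then aaa, provided that any 0 or more characters followed by headerc come after it (write [\s\S] rather than [\s|\S], since | inside a character class is a literal pipe). So it matches every number located before headerc, not only the ones after headerb.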
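To see the problem with the lookahead-only pattern concretely, the following sketch runs it against the sample text from the question. It assumes "\n" line endings and uses top-level statements (C# 9 or later). The variable names are illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample text from the question, with "\n" line endings (an assumption).
string input =
    "headera\naa1aaa\naa2aaa\naa3aaa\n\n" +
    "headerb\naa4aaa\naa5aaa\naa6aaa\n\n" +
    "headerc\naa7aaa\naa8aaa\naa9aaa\n";

// Lookahead only: nothing ties the match to headerb.
var lookaheadOnly = Regex.Matches(input, @"aa(\d+)aaa(?=[\s\S]*headerc)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToList();

Console.WriteLine(string.Join(", ", lookaheadOnly)); // prints: 1, 2, 3, 4, 5, 6
```

The numbers under headera are included because headerc still appears somewhere after them.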

Instead, use an infinite-width lookbehind, which .NET regex supports:

(?s)(?<=\bheaderb\b(?>(?!\bheader[bc]\b).)*)\d+

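How it works: the (?<=\bheaderb\b(?>(?!\bheader[bc]\b).)*) lookbehind requires that the digits be preceded by the whole word headerb and then 0 or more characters, none of which starts headerb or headerc (with the Singleline option, (?s), the . also matches a newline). (?>(?!\bheader[bc]\b).)* is a tempered greedy token: it matches any character that is not the start of the whole word headerc or headerb. As a result, numbers in a headerc...headerd block are not matched, only those in a headerb...headerc block.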
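A short sketch of how this pattern behaves when more blocks follow headerc. The extra headerd block is a made-up extension of the question's sample, "\n" line endings are assumed, and the code uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// The question's sample plus a hypothetical headerd block (not in the original text).
string input =
    "headera\naa1aaa\naa2aaa\naa3aaa\n\n" +
    "headerb\naa4aaa\naa5aaa\naa6aaa\n\n" +
    "headerc\naa7aaa\naa8aaa\naa9aaa\n\n" +
    "headerd\naa10aaa\n";

// (?s) lets . match newlines; the tempered token stops at the next headerb or headerc.
string pattern = @"(?s)(?<=\bheaderb\b(?>(?!\bheader[bc]\b).)*)\d+";

// Here the whole match is the number itself, so no capture group is needed.
var digits = Regex.Matches(input, pattern)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();

Console.WriteLine(string.Join(", ", digits)); // prints: 4, 5, 6
```

Note that this variant matches any digits inside the headerb block, not only those wrapped in aa...aaa.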

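A non-regex alternative: split the text into lines, use LINQ to take the lines between headerb and headerc, and then extract the digits: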

// Requires: using System; using System.Linq; using System.Text.RegularExpressions;
var lines = s.Split(new[] { "\r", "\n" }, StringSplitOptions.RemoveEmptyEntries); // Split into line array
var subset = lines.SkipWhile(p => p != "headerb") // Get to the "headerb" line
                  .Skip(1)    // Get to the line after "headerb"
                  .TakeWhile(m => m != "headerc")  // Grab the lines in the block we need
                  .ToList();
var digits = Regex.Matches(string.Join("\n", subset), "[0-9]+") // Join with a newline so digits on adjacent lines never merge
                 .Cast<Match>()
                 .Select(v => v.Value)
                 .ToList();


+2

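You can do this with a single regex match and then read the captures of the repeated group.

For example, something like this: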

headerb\n(aa(\d+)aaa\n)+\nheaderc

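This matches the whole block. In .NET, a repeated group keeps all of its captures, so you can iterate over the captures of the (\d+) group. You can also name the group, for example (?<number>\d+), and access it as number.

More information: https://msdn.microsoft.com/en-us/library/bs2twtah(v=vs.110).aspx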
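As a minimal sketch of reading every iteration of a repeated group, the code below applies the block pattern above to the question's sample and walks the Captures collection of group 2. It assumes "\n" line endings, which the pattern requires, and uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample text from the question, with "\n" line endings (an assumption).
string input =
    "headera\naa1aaa\naa2aaa\naa3aaa\n\n" +
    "headerb\naa4aaa\naa5aaa\naa6aaa\n\n" +
    "headerc\naa7aaa\naa8aaa\naa9aaa\n";

// Group 1 is the repeated line, group 2 is the number inside it.
Match match = Regex.Match(input, @"headerb\n(aa(\d+)aaa\n)+\nheaderc");

// Group.Value only exposes the last iteration...
string lastOnly = match.Groups[2].Value;

// ...while Group.Captures keeps every iteration, in order.
var allNumbers = match.Groups[2].Captures
    .Cast<Capture>()
    .Select(c => c.Value)
    .ToList();

Console.WriteLine(lastOnly);                      // prints: 6
Console.WriteLine(string.Join(", ", allNumbers)); // prints: 4, 5, 6
```

This is specific to .NET: most other regex engines keep only the last iteration of a repeated group.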

+1

Try this pattern:

(?<=headerb)[\r\n]*(?:aa(?<number>\d+)aaa[\r\n]*)+(?=headerc)

Code example

It outputs exactly what you want (4, 5, 6).

var regex = new Regex(@"(?<=headerb)[\r\n]*(?:aa(?<number>\d+)aaa[\r\n]*)+(?=headerc)", RegexOptions.Singleline);
var match = regex.Match(input); // input is a string holding the text from the question
if (match.Success)
{
    foreach (var number in match.Groups["number"].Captures.Cast<Capture>())
    {
        Console.WriteLine(number);
    }
}
+1

Source: https://habr.com/ru/post/1618250/

