I want to parse some custom tags from a string and also get the untagged content. For example, I have the following line:
Hello World <Red>This is some red text </Red> This is normal <Blue>This is blue text </Blue>
I have a working regex that retrieves the tagged content:
<(?<tag>\w*)>(?<text>.*)</\k<tag>>
However, this only returns:
tag: Red
text: This is some red text
tag: Blue
text: This is blue text
I also need to get matches for the untagged content, so I would get four matches: the two above, plus "Hello World" and "This is normal".
Is this possible with regex?
For example, this is my current function:
public static List<FormattedConsole> FormatColour(string input)
{
    List<FormattedConsole> formatted = new List<FormattedConsole>();
    Regex regex = new Regex("<(?<Tag>\\w+)>(?<Text>.*?)</\\1>", RegexOptions.IgnoreCase
        | RegexOptions.CultureInvariant
        | RegexOptions.IgnorePatternWhitespace
        | RegexOptions.Compiled
    );
    MatchCollection ms = regex.Matches(input);
    foreach (Match match in ms)
    {
        GroupCollection groups = match.Groups;
        FormattedConsole format = new FormattedConsole(groups["Text"].Value, groups["Tag"].Value);
        formatted.Add(format);
    }
    return formatted;
}
As already mentioned, this only returns the text between tags. I also need the text that sits outside any tag.
(By the way, FormattedConsole is just a container holding the text and its colour.)
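To make the desired result concrete, here is a minimal sketch of one possible direction, not a verified solution: a single pattern with two alternatives, one for a tagged span and one for a run of text that contains no '<'. The names TagParserSketch, Split, and the Plain group are hypothetical, and a (Tag, Text) tuple stands in for FormattedConsole, so this assumes a compiler with value-tuple support (C# 7 or later). It also assumes tags are not nested and that a stray '<' in plain text can simply be skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagParserSketch
{
    // First alternative: <Tag>text</Tag>, closing tag found via a named backreference.
    // Second alternative: a run of characters that contains no '<' (untagged text).
    private static readonly Regex Segment = new Regex(
        @"<(?<Tag>\w+)>(?<Text>.*?)</\k<Tag>>|(?<Plain>[^<]+)",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Singleline);

    // Returns (Tag, Text) pairs in input order; Tag is empty for untagged text.
    // Whitespace-only untagged runs are dropped, and all text is trimmed.
    public static List<(string Tag, string Text)> Split(string input)
    {
        var parts = new List<(string Tag, string Text)>();
        foreach (Match m in Segment.Matches(input))
        {
            if (m.Groups["Plain"].Success)
            {
                string plain = m.Groups["Plain"].Value.Trim();
                if (plain.Length > 0)
                {
                    parts.Add(("", plain));
                }
            }
            else
            {
                parts.Add((m.Groups["Tag"].Value, m.Groups["Text"].Value.Trim()));
            }
        }
        return parts;
    }
}
```

Calling TagParserSketch.Split on the example line above should give four segments in order: untagged "Hello World", Red "This is some red text", untagged "This is normal", and Blue "This is blue text". Each pair could then be passed to the FormattedConsole constructor in place of the tuple.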