Regular expression for String.Format-like utility

I am writing a class called StringTemplate that lets you format objects the way String.Format does, but with names instead of indexes for placeholders. Here is an example:

 string s = StringTemplate.Format("Hello {Name}. Today is {Date:D}, and it is {Date:T}.", new { Name = "World", Date = DateTime.Now }); 

To achieve this, I look for the placeholders and replace them with indexes. Then I pass the resulting format string to String.Format.

This works fine except when the template contains doubled curly braces, which are escape sequences. The desired behavior (the same as String.Format) is described below:

  • "Hello {Name}" should be formatted as "Hello World"
  • "Hello {{Name}}" should be formatted as "Hello {Name}"
  • "Hello {{{Name}}}" should be formatted as "Hello {World}"
  • "Hello {{{{Name}}}}" should be formatted as "Hello {{Name}}"

And so on...
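For reference, the escaping rules listed above can be observed directly with String.Format by using index 0 where the template would use {Name}. This is only a small illustration; the class name EscapeDemo is hypothetical.

```csharp
using System;

static class EscapeDemo
{
    // Each call shows how String.Format itself treats doubled braces,
    // with index 0 standing in for the named placeholder {Name}.
    public static string[] Run()
    {
        return new[]
        {
            string.Format("Hello {0}", "World"),        // Hello World
            string.Format("Hello {{0}}", "World"),      // Hello {0}
            string.Format("Hello {{{0}}}", "World"),    // Hello {World}
            string.Format("Hello {{{{0}}}}", "World"),  // Hello {{0}}
        };
    }
}
```

Braces are consumed in pairs from the left, so an odd run of opening braces leaves exactly one brace to start a placeholder.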

But my current regex does not detect the escape sequences and always treats the substring between the braces as a placeholder, so I get things like "Hello {0}".

Here is my current regex:

 private static Regex _regex = new Regex(@"{(?<key>\w+)(?<format>:[^}]+)?}", RegexOptions.Compiled); 

How can I change this regex to ignore escaped curly braces? What seems really hard is that I have to detect placeholders depending on whether the number of braces is odd or even. I can't think of a simple way to do this with a regular expression. Is it even possible?




For completeness, here is the full code of the StringTemplate class:

    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    public class StringTemplate
    {
        private string _template;

        // Matches {key} or {key:format}; it does not account for escaped braces.
        private static Regex _regex = new Regex(@"{(?<key>\w+)(?<format>:[^}]+)?}", RegexOptions.Compiled);

        public StringTemplate(string template)
        {
            if (template == null) throw new ArgumentNullException("template");
            this._template = template;
        }

        public static implicit operator StringTemplate(string s)
        {
            return new StringTemplate(s);
        }

        public override string ToString()
        {
            return _template;
        }

        public string Format(IDictionary<string, object> values)
        {
            if (values == null)
            {
                throw new ArgumentNullException("values");
            }

            // Assign each key an index into the argument array.
            Dictionary<string, int> indexes = new Dictionary<string, int>();
            object[] array = new object[values.Count];
            int i = 0;
            foreach (string key in values.Keys)
            {
                array[i] = values[key];
                indexes.Add(key, i++);
            }

            // Rewrite {key:format} as {index:format}; unknown keys are kept as literal text.
            MatchEvaluator evaluator = (m) =>
            {
                if (m.Success)
                {
                    string key = m.Groups["key"].Value;
                    string format = m.Groups["format"].Value;
                    int index = -1;
                    if (indexes.TryGetValue(key, out index))
                    {
                        return string.Format("{{{0}{1}}}", index, format);
                    }
                }
                return string.Format("{{{0}}}", m.Value);
            };

            string templateWithIndexes = _regex.Replace(_template, evaluator);
            return string.Format(templateWithIndexes, array);
        }

        // Builds a name/value dictionary from the public properties of an object.
        private static IDictionary<string, object> MakeDictionary(object obj)
        {
            Dictionary<string, object> dict = new Dictionary<string, object>();
            foreach (var prop in obj.GetType().GetProperties())
            {
                dict.Add(prop.Name, prop.GetValue(obj, null));
            }
            return dict;
        }

        public string Format(object values)
        {
            return Format(MakeDictionary(values));
        }

        public static string Format(string template, IDictionary<string, object> values)
        {
            return new StringTemplate(template).Format(values);
        }

        public static string Format(string template, object values)
        {
            return new StringTemplate(template).Format(values);
        }
    }
+2
c# regex formatting templates
Sep 18 '09 at 16:19
4 answers

You can use a regex to match a balanced pair of braces, and then work out what to do with them. Remember that .NET regular expressions are not "regular": balancing groups let them keep track of nesting.

    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            var d = new Dictionary<string, string> { { "Name", "World" } };
            var t = new Test();
            Console.WriteLine(t.Replace("Hello {Name}", d));
            Console.WriteLine(t.Replace("Hello {{Name}}", d));
            Console.WriteLine(t.Replace("Hello {{{Name}}}", d));
            Console.WriteLine(t.Replace("Hello {{{{Name}}}}", d));
            Console.ReadKey();
        }
    }

    class Test
    {
        // Matches a balanced run of braces; the D group acts as a nesting counter.
        private Regex MatchNested = new Regex(
            @"\{ (?> ([^{}]+) | \{ (?<D>) | \} (?<-D>) )* (?(D)(?!)) \}",
            RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled | RegexOptions.Singleline);

        public string Replace(string input, Dictionary<string, string> vars)
        {
            Matcher matcher = new Matcher(vars);
            return MatchNested.Replace(input, matcher.Replace);
        }

        private class Matcher
        {
            private Dictionary<string, string> Vars;

            public Matcher(Dictionary<string, string> vars)
            {
                Vars = vars;
            }

            public string Replace(Match m)
            {
                string name = m.Groups[1].Value;

                // Number of braces on each side of the name.
                int length = (m.Groups[0].Length - name.Length) / 2;

                // Even count: everything is escaped. Odd count: substitute the value.
                string inner = (length % 2) == 0 ? name : Vars[name];
                return MakeString(inner, length / 2);
            }

            private string MakeString(string inner, int braceCount)
            {
                StringBuilder sb = new StringBuilder(inner.Length + (braceCount * 2));
                sb.Append('{', braceCount);
                sb.Append(inner);
                sb.Append('}', braceCount);
                return sb.ToString();
            }
        }
    }
+1
Sep 18 '09 at 20:20

This may well be possible with regular expressions, but I'm not at all convinced that it would be the easiest solution to maintain. Given that you are only interested in braces and colons (I think), I would personally avoid using regular expressions.

I would build a sequence of tokens, each of which is either a literal or a format item. Build it by simply walking through the string and noting the opening and closing braces. Evaluating the sequence is then just a matter of concatenating the tokens, formatting each one where necessary.
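A minimal sketch of such a token-based scanner might look like the following. The names TemplateScanner and Token are hypothetical, and several choices are assumptions rather than anything stated here: unmatched braces throw a FormatException, unknown keys throw KeyNotFoundException, and values are formatted with the current culture.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

static class TemplateScanner
{
    // A token is either literal text (Key == null) or a placeholder.
    private sealed class Token
    {
        public string Text;    // literal text, used when Key is null
        public string Key;     // placeholder name, or null for a literal
        public string Format;  // optional format string such as "D"; may be null
    }

    private static List<Token> Tokenize(string template)
    {
        var tokens = new List<Token>();
        var literal = new StringBuilder();
        int i = 0;
        while (i < template.Length)
        {
            char c = template[i];
            bool doubled = i + 1 < template.Length && template[i + 1] == c;
            if ((c == '{' || c == '}') && doubled)
            {
                literal.Append(c);   // "{{" or "}}" is an escaped brace
                i += 2;
            }
            else if (c == '{')
            {
                int close = template.IndexOf('}', i + 1);
                if (close < 0)
                    throw new FormatException("Unclosed '{' at index " + i);
                string body = template.Substring(i + 1, close - i - 1);
                int colon = body.IndexOf(':');
                if (literal.Length > 0)
                {
                    tokens.Add(new Token { Text = literal.ToString() });
                    literal.Length = 0;
                }
                tokens.Add(new Token
                {
                    Key = colon < 0 ? body : body.Substring(0, colon),
                    Format = colon < 0 ? null : body.Substring(colon + 1)
                });
                i = close + 1;
            }
            else if (c == '}')
            {
                // Assumed policy: a lone closing brace is an error.
                throw new FormatException("Unmatched '}' at index " + i);
            }
            else
            {
                literal.Append(c);
                i++;
            }
        }
        if (literal.Length > 0)
            tokens.Add(new Token { Text = literal.ToString() });
        return tokens;
    }

    public static string Format(string template, IDictionary<string, object> values)
    {
        var result = new StringBuilder();
        foreach (Token t in Tokenize(template))
        {
            if (t.Key == null) { result.Append(t.Text); continue; }
            object value = values[t.Key];   // throws KeyNotFoundException for unknown names
            var formattable = value as IFormattable;
            result.Append(formattable != null
                ? formattable.ToString(t.Format, CultureInfo.CurrentCulture)
                : Convert.ToString(value, CultureInfo.CurrentCulture));
        }
        return result.ToString();
    }
}
```

Because escaped pairs are consumed before a single brace is considered, "Hello {{{Name}}}" yields "Hello {World}" without any parity arithmetic, and an input such as "{{Name} foo" is rejected at the lone closing brace.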

Then again, I have never been a big fan of regular expressions. Sometimes they are great, but a lot of the time they feel like overkill. Maybe there is some clever way to get them to do what you want in this case...

By the way, you will need to decide what should happen when the curly braces are not matched properly, for example:

 {{Name} foo 
+3
Sep 18 '09 at 16:28

Parity is usually very easy to handle with regular expressions. For example, this expression matches any string with an even number of As, but not an odd number:

 (AA)* 

So all you have to do is find expressions that match only an odd number of {s and }s:

    {({{)*
    }(}})*

(escaping of the characters notwithstanding). Adding this idea to your current expression gives something like

 {({{)*(?<key>\w+)(?<format>:[^}]+)?}(}})* 

However, this does not match up the number of braces on the two sides. In other words, {{{ will pair with } because both counts are odd. Regular expressions in the classical sense cannot count, so you will not find an expression that requires the same number of braces on each side.
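The mismatch can be seen with a quick check. This is only an illustration: the class name ParityDemo is hypothetical, and the braces in the pattern are escaped for safety.

```csharp
using System;
using System.Text.RegularExpressions;

static class ParityDemo
{
    // The odd-count pattern from the text, with the braces escaped.
    public static readonly Regex OddBraces = new Regex(
        @"\{(\{\{)*(?<key>\w+)(?<format>:[^}]+)?\}(\}\})*");

    // Returns the matched text, or null when there is no match.
    public static string FirstMatch(string input)
    {
        Match m = OddBraces.Match(input);
        return m.Success ? m.Value : null;
    }
}
```

FirstMatch("{{{Name}") returns the whole input even though three opening braces face a single closing one. Because the pattern is not anchored, FirstMatch("{{Name}}") also succeeds: it returns the inner "{Name}", so a fully escaped placeholder is still picked up.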

What you really need is a custom parser that reads the string and counts instances of { (but not {{), matching them against instances of } (but not }}) on the other side. I think you will find that this is how the string formatters in .NET work behind the scenes, because regular expressions are not suited to parsing nested structures of any kind.

Alternatively, you can combine both ideas: match potential tokens with a regular expression, then verify the balance of the braces with a quick check on each match. That is likely to end up confusing and indirect, though. You are usually better off writing your own parser for this kind of scenario.

+3
Sep 18 '09 at 16:29

In the end, I used a technique similar to what Gavin suggested.

I changed the regular expression so that it matches all the braces around the placeholder:

 private static Regex _regex = new Regex(@"(?<open>{+)(?<key>\w+)(?<format>:[^}]+)?(?<close>}+)", RegexOptions.Compiled); 

And I changed the MatchEvaluator logic so that it handles the escaped braces correctly:

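    MatchEvaluator evaluator = (m) =>
    {
        if (m.Success)
        {
            string open = m.Groups["open"].Value;
            string close = m.Groups["close"].Value;
            string key = m.Groups["key"].Value;
            string format = m.Groups["format"].Value;

            // An even number of opening braces means every brace is escaped:
            // leave the text alone and let String.Format unescape it.
            if (open.Length % 2 == 0)
                return m.Value;

            // The last brace on each side delimits the placeholder itself.
            // (RemoveLastChar is a small helper that is not shown here.)
            open = RemoveLastChar(open);
            close = RemoveLastChar(close);

            int index = -1;
            if (indexes.TryGetValue(key, out index))
            {
                return string.Format("{0}{{{1}{2}}}{3}", open, index, format, close);
            }
            else
            {
                // Unknown key: emit it as escaped literal text, {{key:format}},
                // so that String.Format does not choke on a lone closing brace.
                return string.Format("{0}{{{{{1}{2}}}}}{3}", open, key, format, close);
            }
        }
        return m.Value;
    };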

I rely on String.Format to throw a FormatException if necessary. I wrote a few unit tests, and so far everything works fine...
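For anyone who wants to try this out in isolation, here is a self-contained sketch of the same idea. The class name BraceAwareTemplate is hypothetical, RemoveLastChar is a stand-in for the helper that is not shown above, and emitting unknown keys as escaped literal text is an assumption about the intended behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BraceAwareTemplate
{
    // Same shape as the regex above, with the braces escaped for clarity.
    private static readonly Regex Placeholder = new Regex(
        @"(?<open>\{+)(?<key>\w+)(?<format>:[^}]+)?(?<close>\}+)");

    // Hypothetical stand-in for the RemoveLastChar helper.
    private static string RemoveLastChar(string s)
    {
        return s.Length == 0 ? s : s.Substring(0, s.Length - 1);
    }

    public static string Format(string template, IDictionary<string, object> values)
    {
        // Assign each key an index into the argument array.
        var indexes = new Dictionary<string, int>();
        var args = new object[values.Count];
        int i = 0;
        foreach (var pair in values)
        {
            args[i] = pair.Value;
            indexes.Add(pair.Key, i++);
        }

        string indexed = Placeholder.Replace(template, m =>
        {
            string open = m.Groups["open"].Value;
            string close = m.Groups["close"].Value;
            string key = m.Groups["key"].Value;
            string format = m.Groups["format"].Value;   // includes the leading ':'

            // Even number of opening braces: fully escaped, leave untouched.
            if (open.Length % 2 == 0) return m.Value;

            open = RemoveLastChar(open);
            close = RemoveLastChar(close);

            int index;
            if (indexes.TryGetValue(key, out index))
                return open + "{" + index + format + "}" + close;

            // Assumed behavior: unknown keys become escaped literal text.
            return open + "{{" + key + format + "}}" + close;
        });

        // String.Format unescapes the remaining braces and throws on malformed input.
        return string.Format(indexed, args);
    }
}
```

With Name mapped to "World", the four cases from the question produce "Hello World", "Hello {Name}", "Hello {World}" and "Hello {{Name}}". Only the opening side is checked for parity, so a template whose two sides disagree is not always rejected.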

Thank you all for your help!

0
Sep 18 '09 at 22:57


