Regular expression with empty group "()" returning strange results

This is a bit of an edge case, but I have the following situation with the regular expression "()": when it is used to split a string into an array, the results look strange to me. For example, this line of code:

string[] res = new Regex("()").Split("hi!"); 

sets res to an array of 9 (!) elements: ["", "", "h", "", "i", "", "!", "", ""]

I expect it to return these 5 elements: ["h", "", "i", "", "!"]. The reason I need this particular result is compatibility with another regexp library ...

My question is: could this behavior be caused by some missing options on the Regex object, by a bug in my code, or something similar? Or is it well-defined and actually the correct way for it to work? Also, is there a way to get it to return the second (expected) result?

2 answers

I have marked the positions where your regular expression matches with a | : "|h|i|!|"

Split returns an array whose elements are the pieces of text between two adjacent matches, between the beginning of the string and the first match, or between the last match and the end of the string. It returns them in the order in which they occur in the string. This gives the following result: ["","h","i","!",""]

This explains 5 of the 9 elements of the array.

However, "If capturing parentheses are used in a Regex.Split expression, any captured text is included in the resulting string array." (direct quote from MSDN, here: http://msdn.microsoft.com/en-us/library/ze12yx1d.aspx )

In this case, the captured text is an empty string. Since we had 4 matches, this explains the remaining 4 elements in your result.

Thus, the full result: ["","","h","","i","","!","",""]
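To see the two contributions separately, here is a minimal sketch that splits the same input once with an empty pattern and once with the empty capturing group. The class name SplitDemo and the helper Show are made up for this illustration; the expected counts in the comments follow from the explanation above and assume .NET 2.0 or later.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: formats an array so that empty strings stay visible.
    public static string Show(string[] parts) =>
        "[" + string.Join(",", Array.ConvertAll(parts, p => "\"" + p + "\"")) + "]";

    public static void Main()
    {
        // Empty pattern, no capturing group: only the text around the 4 matches.
        string[] plain = new Regex("").Split("hi!");
        Console.WriteLine(plain.Length + " " + Show(plain));
        // 5 ["","h","i","!",""]

        // Empty capturing group: each of the 4 matches also adds its captured "".
        string[] captured = new Regex("()").Split("hi!");
        Console.WriteLine(captured.Length + " " + Show(captured));
        // 9 ["","","h","","i","","!","",""]
    }
}
```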

+3
source

I would say that nine elements is correct, because the expression also matches before "h" and after "!".

To avoid matching at the beginning or end, you can add a lookbehind and a lookahead to make sure there is a character on each side of the empty match: "(?<=.)()(?=.)"
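A small sketch of that pattern applied to the input from the question. The class and method names (InnerSplitDemo, SplitInner) are invented for the example. Note that "." does not match a newline by default, so input containing line breaks would likely need RegexOptions.Singleline.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InnerSplitDemo
{
    // Splits only at positions that have a character on both sides;
    // the empty capturing group still contributes one "" per match.
    public static string[] SplitInner(string input) =>
        new Regex("(?<=.)()(?=.)").Split(input);

    public static void Main()
    {
        string[] res = SplitInner("hi!");
        Console.WriteLine(res.Length);            // 5
        Console.WriteLine(string.Join("|", res)); // h||i||!
    }
}
```

The pattern matches only between "h" and "i" and between "i" and "!", so the leading and trailing empty strings disappear while the two captured empty strings remain.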


Source: https://habr.com/ru/post/1299711/

