.NET Regular Expression Anchors

Question


This is a follow-up to an answer given to the question "Check the line to see if all characters are hexadecimal values".

The suggested regular expression is as follows:

\A\b[0-9a-fA-F]+\b\Z

Now, \A and \Z seem to be equivalent to ^ and $ respectively. \Z is slightly lenient in that it still matches when a newline follows it at the very end of the string (which may or may not be intended).

I don't understand why the \b anchor ("match at a word boundary") is used. Isn't the beginning or end of a string always a word boundary?

Ultimately, the regular expression could be rewritten as ^[0-9a-fA-F]+$ with the same behavior (ignoring the trailing \n issue). Am I missing something? Is \b there for some weird edge case?

Test cases:

 123ABC -> true
 123def -> true
 123g   -> false


regex .net

knittl Apr 15 '16 at 6:27


1 answer

Wiktor Stribiżew · Answer 1 · 2016-04-15 07:12

The word boundary \b matches between a word character and a non-word character. It also matches at the start of the string if the first character is a word character, and at the end of the string if the last character is a word character.

Thus, \A\b[0-9a-fA-F]+\b\Z is equivalent to \A[0-9a-fA-F]+\Z, because every character in the string must be a word character (a digit [0-9] or a letter [a-fA-F]) for the pattern to match.
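The equivalence can be checked against the test cases from the question. The following is only a minimal sketch: it uses Java's java.util.regex as a stand-in for .NET, on the assumption (true for these ASCII inputs) that \A, \Z and \b behave the same way in both engines. The class name HexBoundaryDemo and the helper found are made up for illustration.

```java
import java.util.regex.Pattern;

final class HexBoundaryDemo {
    // Pattern from the question, including the \b anchors.
    static final Pattern WITH_B = Pattern.compile("\\A\\b[0-9a-fA-F]+\\b\\Z");
    // The same pattern with the \b anchors dropped.
    static final Pattern WITHOUT_B = Pattern.compile("\\A[0-9a-fA-F]+\\Z");

    // Hypothetical helper: true if the pattern matches somewhere in s.
    // Because of \A, a match can only start at index 0.
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        // Test cases from the question, plus the empty string.
        for (String s : new String[] {"123ABC", "123def", "123g", ""}) {
            System.out.println("\"" + s + "\" with \\b: " + found(WITH_B, s)
                    + ", without \\b: " + found(WITHOUT_B, s));
        }
    }
}
```

Both patterns should agree on every input here, since the character class only contains word characters.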

It would be a different story with \A\b[0-9a-fA-F-]+\b\Z, where the character class also allows a hyphen: that pattern only matches strings that start and end with a word character.
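To see where \b starts to matter, here is a small sketch of the hyphen variant. As before, Java is used as a stand-in for .NET (an assumption that holds for these ASCII inputs), and the names HyphenBoundaryDemo and found are hypothetical.

```java
import java.util.regex.Pattern;

final class HyphenBoundaryDemo {
    // The character class now also allows '-', which is not a word character.
    static final Pattern WITH_B = Pattern.compile("\\A\\b[0-9a-fA-F-]+\\b\\Z");
    static final Pattern WITHOUT_B = Pattern.compile("\\A[0-9a-fA-F-]+\\Z");

    // Hypothetical helper: true if the pattern matches (anchored at index 0 by \A).
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        // "12-ab" starts and ends with a word character; the other two do not.
        for (String s : new String[] {"12-ab", "-12-", "12-"}) {
            System.out.println("\"" + s + "\" with \\b: " + found(WITH_B, s)
                    + ", without \\b: " + found(WITHOUT_B, s));
        }
    }
}
```

With the hyphen allowed, the two patterns should disagree on "-12-" and "12-": only the version without \b accepts them.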

Use \z to match the whole string without allowing a trailing \n at the end.
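The difference between \Z and \z only shows up when the input ends in a newline. A minimal sketch, again in Java as a stand-in for .NET (both engines treat a single trailing \n this way; the names EndAnchorDemo and found are made up for illustration):

```java
import java.util.regex.Pattern;

final class EndAnchorDemo {
    // \Z: end of string, or just before a final newline.
    static final Pattern UPPER_Z = Pattern.compile("\\A[0-9a-fA-F]+\\Z");
    // \z: the very end of the string, nothing may follow.
    static final Pattern LOWER_Z = Pattern.compile("\\A[0-9a-fA-F]+\\z");

    // Hypothetical helper: true if the pattern matches (anchored at index 0 by \A).
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        String plain = "123ABC";
        String withNewline = "123ABC\n";
        System.out.println("\\Z plain: " + found(UPPER_Z, plain));
        System.out.println("\\z plain: " + found(LOWER_Z, plain));
        System.out.println("\\Z with trailing newline: " + found(UPPER_Z, withNewline));
        System.out.println("\\z with trailing newline: " + found(LOWER_Z, withNewline));
    }
}
```

Only the \z version rejects "123ABC\n", which is usually what you want when validating input.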
