Regex anchors inside a character class

Can I use anchors inside a character class? This does not work:

analyze-string('abcd', '[\s^]abcd[\s$]') 

It seems that ^ and $ are treated as literals inside the character class; however, escaping them ( \^ , \$ ) does not work either.

I am trying to use this expression to create word boundaries ( \b is not available in XSLT / XQuery), but I would prefer not to use groups ( (^|\s) ), since non-capturing groups aren't available. That means that in some scenarios I can end up with a large number of unnecessary capture groups, which creates a new task of finding the "real" capture groups among the unnecessary ones.

3 answers

I believe the answer is no: you cannot include ^ and $ as anchors inside [] , only as literals. (I have also wished this were possible.)

However, you can concatenate a space onto the front and back of the string, and then just look for \s as the word boundary and ignore the anchors. For instance:

 analyze-string(concat(' ', 'abcd xyz abcd', ' '), '\sabcd\s') 

You may also want + after each \s , but this is a separate issue.
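
To make the padding idea a bit more concrete, here is a minimal sketch, assuming an XQuery 3.0 processor. The variable names are only illustrative, and \s+ is used in place of \s as suggested above.

```xquery
xquery version "3.0";

(: Pad the input so that the first and last words are also surrounded by whitespace. :)
let $input  := 'abcd xyz abcd'
let $padded := concat(' ', $input, ' ')
(: \s+ on each side stands in for the word boundary that \b would provide. :)
let $result := analyze-string($padded, '\s+abcd\s+')
(: Each occurrence shows up as an fn:match element in the result. :)
return count($result/fn:match)
```

One thing to keep in mind: the whitespace is consumed as part of each match, and matches do not overlap. So in an input like 'abcd abcd', the two words share a single space and the second occurrence would not be reported.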


If you are using analyze-string as a function, then you are presumably using an XSLT 3.0 or XQuery 3.0 implementation.

In that case, why do you say that non-capturing groups are not available? The XPath Functions and Operators 3.0 spec makes it clear: "Non-capturing groups are also recognized. These are indicated by the syntax (?:xxxx)."
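
As a rough illustration of that point, the boundary alternatives from the question could be written as non-capturing groups, so that only the word itself is captured. This is a sketch that assumes a processor implementing the 3.0 regex syntax; a 2.0 processor would be expected to reject (?: as a syntax error.

```xquery
xquery version "3.0";

(: The boundaries are non-capturing, so they do not add capture groups. :)
let $result := analyze-string('abcd xyz abcd', '(?:^|\s)(abcd)(?:\s|$)')
(: Only capturing groups are numbered, so the word is group 1 in every match. :)
return $result/fn:match/fn:group[@nr = 1]/string()
```

The adjacent boundary whitespace is still part of each fn:match, but it no longer appears as an fn:group, which is what the question was trying to avoid.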


Using a caret immediately after the opening square bracket negates the character class. That essentially gives you the opposite of what you want: the class will match any character that is not listed in it. Negated character classes also match (invisible) line break characters.

Perhaps you could try a negative lookahead, if your regex engine supports it:
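
A small sketch of the negation behaviour, written with the standard matches function; the inputs are made up for illustration.

```xquery
xquery version "3.0";

(
  matches('a', '[^\s]'),      (: true: 'a' is not a whitespace character :)
  matches(' ', '[^\s]'),      (: false: the only character is whitespace :)
  matches('&#10;', '[^a]')    (: true: a line feed counts as "not a" :)
)
```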

 (?!\s) 

Source: https://habr.com/ru/post/946129/

