Regex accepts * even though it is not in the pattern

Having developed a JavaScript regular expression, we discovered some strange behavior.

For the following pattern: [\'-=]

The * character is accepted. ( ' , - and = are also accepted, but this is expected.)

We can replace '=' with other characters and see the same kind of behavior. If we change the order of the characters in the pattern, * is no longer accepted.

Has anyone got an idea about this?

+6
5 answers

The "-" in the middle of the pattern is the cause of your problem. Inside a character class like this one, "-" is special: it means "all characters in between." Thus '-= means "all characters from ' to =", and it so happens that * is in this range.

To fix this, reorder the list of characters so that the "-" is at the end, or escape it with a backslash.
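A minimal sketch comparing the original class with the two fixes described above. The ^ and $ anchors and the sample characters are assumptions added for illustration; they are not part of the question.

```javascript
// Original class: "-" sits between "'" and "=", so it defines a range.
const original = /^[\'-=]$/;
// Fix 1: move "-" to the end of the class, where it is taken literally.
const dashLast = /^[\'=-]$/;
// Fix 2: escape "-" with a backslash.
const dashEscaped = /^[\'\-=]$/;

// Print, for each sample character, whether each pattern accepts it.
const samples = ["'", "-", "=", "*", "5"];
for (const ch of samples) {
  console.log(
    JSON.stringify(ch),
    original.test(ch),
    dashLast.test(ch),
    dashEscaped.test(ch)
  );
}
```

Only the original pattern accepts "*" and "5"; both fixed patterns accept just ' , - and = .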

+10

Because in this case - means a range. In the ASCII table, * lies between ' and = . Your pattern will also match all other characters between ' and = (e.g. digits). You can look up the full range in any ASCII table.
If you want to match only ' , = or - , you must escape the minus sign. Use this pattern: [\'\-=]
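To make the ASCII argument concrete, the following sketch compares character codes directly. The helper name inRange is hypothetical and only mirrors what the range '-= does; it is not how the regex engine is implemented.

```javascript
// Character codes of the range endpoints and of the unexpected match.
const low = "'".charCodeAt(0);  // 39
const high = "=".charCodeAt(0); // 61
const star = "*".charCodeAt(0); // 42

// Hypothetical helper: a character falls in the range when its code
// lies between the codes of the two endpoints (inclusive).
const inRange = (ch) => {
  const code = ch.charCodeAt(0);
  return code >= low && code <= high;
};

console.log(low, star, high);        // 39 42 61
console.log(inRange("*"), inRange("7"), inRange("a")); // true true false
```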

+5

The - character has a special meaning inside character classes in a regexp.

It creates a range.

[\'-=] means: accept all characters from ' to = (the \' is just an escaped ').

To include a literal - in a character class, put it at the end.

[\'=-] will do what you expect.

+4

I think this is because you need to escape the "-"; otherwise it denotes a range (as in [A-Z]).

+4

The - symbol is used to indicate a range in a set, for example [a-z] . Your set matches any character from ' to = , that is, all of the characters '()*+,-./0123456789:;<= .

You need to escape - to use it literally:

 [\'\-=] 
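
The list of matched characters given above can be reproduced with a short loop. This is a sketch that assumes only printable ASCII (codes 32 to 126) is of interest; the variable names are illustrative.

```javascript
// Collect every printable ASCII character accepted by the original class.
const charClass = /[\'-=]/;
let accepted = "";
for (let code = 32; code <= 126; code++) {
  const ch = String.fromCharCode(code);
  if (charClass.test(ch)) accepted += ch;
}

console.log(accepted);        // '()*+,-./0123456789:;<=
console.log(accepted.length); // 23
```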
+4

Source: https://habr.com/ru/post/908514/

