The JS regular expression engine parses the string as a sequence of characters and the locations between them. In the following diagram, the locations are marked with hyphens:
-T-h-e- -N-u-m-b-e-r- -i-s- -(-1-2-3-)-(-2-3-4-)-
|||                                             |
||Location between T and h, etc. .............  |
|1st character                                  |
start                                      -> end
All of these positions can be analyzed and matched with a regular expression.
Since /\W*/g is a regular expression matching all non-overlapping occurrences (due to the g modifier) of 0 or more (due to the * quantifier) non-word characters, all the positions before word characters are matched as well. Between T and h there is a location that is tested against the regular expression, and since there is no non-word character there (h is a word char), an empty match is returned (as \W* can match an empty string).
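To see those positions concretely, here is a minimal sketch that lists every match of /\W*/g together with its index, and then replaces each match with _. The sample string "Th (1)" and the variable names are illustrative assumptions, not taken from the original question.

```javascript
// Illustrative input (an assumption): two word chars, then " (", "1", ")".
const sample = "Th (1)";

// Collect [index, matched text] for every match of /\W*/g.
// Empty strings in the list are the zero-width matches before word chars.
const matches = [...sample.matchAll(/\W*/g)].map((m) => [m.index, m[0]]);
console.log(matches);

// Every match, empty or not, is replaced with "_".
const replaced = sample.replace(/\W*/g, "_");
console.log(replaced);
```

The list should contain empty matches at the positions before T, h and 1 and at the end of the string, plus the two non-empty matches " (" and ")".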
So, you need to replace the start of the string and each non-word character with _ . The naive approach is to use .replace(/\W|^/g, '_') . However, there is a caveat: if the string starts with a non-word character, no extra _ is added at the start of the string:
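console.log("Hi there.".replace(/\W|^/g, '_'));  // _Hi_there_
console.log(" Hi there.".replace(/\W|^/g, '_')); // _Hi_there_ (the start of the string gets no _ of its own)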
Note that here \W comes first in the alternation and "wins" when matching at the beginning of the string: the leading space is matched, and then no start-of-string position is found at the next match iteration.
Now you might think that you can fix this by swapping the alternatives, /^|\W/g . Look here:
console.log("Hi there.".replace(/^|\W/g, '_'));  // _Hi_there_
console.log(" Hi there.".replace(/^|\W/g, '_')); // _ Hi_there_
The second result, _ Hi_there_ , shows how the JS regular expression engine handles zero-width matches during a replace operation: once a zero-width match is found (here, the position at the start of the string), the replacement occurs and the regex's lastIndex property is incremented, so matching resumes at the position after the first character. This is why the first space is preserved and is never matched by \W .
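The skipped space can be made visible by listing where /^|\W/g actually matches. This is a minimal sketch with illustrative variable names; matchAll is used here on the assumption, based on the ECMAScript specification, that it steps past zero-width matches the same way replace does.

```javascript
const input = " Hi there.";

// List [index, matched text] for every match of /^|\W/g.
const found = [...input.matchAll(/^|\W/g)].map((m) => [m.index, m[0]]);
console.log(found);

// After the zero-width match at index 0, the search resumes at index 1,
// so the space at index 0 never shows up as a \W match.
const leadingSpaceMatched = found.some(
  ([index, text]) => index === 0 && text === " "
);
console.log(leadingSpaceMatched);
```

Only the empty match at index 0, the inner space and the final dot are reported; the leading space itself is absent from the list.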
The solution is to make the start-of-string alternative consume the leading non-word character, if there is one, so that it is not skipped after a zero-width match:
console.log("Hi there.".replace(/^(\W?)|\W/g, function($0,$1) { return $1 ? "__" : "_"; }));  // _Hi_there_
console.log(" Hi there.".replace(/^(\W?)|\W/g, function($0,$1) { return $1 ? "__" : "_"; })); // __Hi_there_
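As a sanity check, the callback version can be compared against a plainer formulation of the same goal: prepend one _ and then replace every non-word character. This is a swapped-in alternative rather than the approach shown above, and the helper names withCallback and withPrefix are hypothetical.

```javascript
// The approach from the answer: one regex, the callback picks "_" or "__".
function withCallback(str) {
  return str.replace(/^(\W?)|\W/g, function ($0, $1) {
    // $1 is non-empty only when a leading non-word char was consumed.
    return $1 ? "__" : "_";
  });
}

// Alternative sketch: add the leading "_" separately, then replace \W.
function withPrefix(str) {
  return "_" + str.replace(/\W/g, "_");
}

// Print both results side by side for a few sample inputs.
for (const s of ["Hi there.", " Hi there.", ""]) {
  console.log(JSON.stringify(s), withCallback(s), withPrefix(s));
}
```

For these sample inputs both helpers should print the same string, which suggests the regex with the callback does what the prose describes.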