There are two problems with your approach.
- Your first lookaround should be a lookbehind. When you write (?!cat), the engine checks that the next three characters are not cat, then resets to the position where it started (that is what makes it a lookahead), and then you try to match dog at those same three characters. The lookahead therefore adds nothing: if you can match dog, you obviously cannot match cat at the same position. What you want is a lookbehind, (?<!cat), which checks that the preceding characters are not cat. Unfortunately, JavaScript does not support lookbehinds (they were only added in ES2018, so older engines reject them).
- You want to logically OR the two lookarounds. In your case, if either lookaround fails, the whole pattern fails, so both requirements (no cat at either end) must be met. But you actually want to OR them. If lookbehinds were supported, that would look like (?<!cat)dog|dog(?!cat) (note that the alternation splits the entire pattern in two). But, as I said, lookbehinds are not available here. The reason you seemingly ORed the two lookarounds in your first catdogdog example is that the preceding cat simply was not checked (see the first point).
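Both points can be checked directly. The following is only a minimal sketch: the sample string is made up, and the attempted pattern is assumed to be along the lines of (?!cat)dog(?!cat), since the original pattern is not quoted in full here. The lookbehind pattern is built with new RegExp inside try/catch so the snippet still runs on engines that lack lookbehind support.

```javascript
// Hypothetical sample input covering all four situations.
const sample = "catdogcat catdog dogcat dog";

// Point 1: a lookahead placed *before* dog is redundant, so the assumed
// attempt behaves exactly like the same pattern without it.
const attempt = sample.replace(/(?!cat)dog(?!cat)/g, "000");
const withoutLeading = sample.replace(/dog(?!cat)/g, "000");
console.log(attempt === withoutLeading); // true
console.log(attempt); // "catdogcat cat000 dogcat 000" (dogcat is missed)

// Point 2: the ORed version needs lookbehind support (ES2018 or later).
let ored = null;
try {
  const pattern = new RegExp("(?<!cat)dog|dog(?!cat)", "g");
  ored = sample.replace(pattern, "000");
} catch (e) {
  // Engines without lookbehind throw a SyntaxError; ored stays null.
}
console.log(ored); // "catdogcat cat000 000cat 000" where lookbehind exists
```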
How to work around lookbehinds, then? Kolink's answer suggests (?!cat)...dog, which places the lookaround at the position where cat would start and uses a lookahead. This has two new problems. It cannot match a dog at the beginning of the string (because it requires three characters in front of it). And it cannot match two consecutive dogs, because matches cannot overlap: after matching the first dog, the engine requires three new characters, and those would consume the next dog before dog itself could be matched again.
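Both limitations of that workaround are easy to reproduce. This is a small sketch with made-up input strings; only the pattern itself comes from the text above.

```javascript
const pattern = /(?!cat)...dog/g;

// Problem 1: a dog at the very start has no three characters before it.
const atStart = "dog park".match(pattern);
console.log(atStart); // null

// Problem 2: matches cannot overlap. The first match consumes "my dog",
// leaving only "dog", which is too short for "...dog".
const twoDogs = "my dogdog".match(pattern);
console.log(twoDogs); // ["my dog"]

// With a third dog, the second dog is swallowed by the three dots.
const threeDogs = "my dogdogdog".match(pattern);
console.log(threeDogs); // ["my dog", "dogdog"]
```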
Sometimes you can get around this by reversing both the pattern and the string, which turns the lookbehind into a lookahead, but in your case that would turn the lookahead at the end into a lookbehind.
Regex-only solution
We need to be a little smarter. Since matches cannot overlap, we can try to match catdogcat explicitly without replacing it (thereby skipping it in the target string), and then simply replace every dog we find. We put these two cases in an alternation, so both are tried at each position in the string (with the catdogcat option taking precedence, although that does not matter here). The problem is how to get conditional replacement strings. But let's see what we have so far:
text.replace(/(catdog)(?=cat)|dog/g, "$1[or 000 if $1 didn't match]")
So, in the first alternative, we match catdog, capture it into group 1, and check that another cat follows. In the replacement string, we simply write $1 back. The beauty is that if the second alternative matched, the first group is unused and therefore becomes an empty string in the replacement. The reason we match only catdog and use a lookahead, instead of matching catdogcat right away, is again overlapping matches. If we used catdogcat, then for the input catdogcatdogcat the first match would consume everything up to and including the second cat, so the second dog could not be recognized by the first alternative.
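The difference between the two ways of writing the first alternative can be seen in a short experiment. This is an illustrative sketch only: the variable names are made up, and the replacement strings are chosen just to make the group contents visible.

```javascript
const withLookahead = /(catdog)(?=cat)|dog/g;
const consuming = /(catdogcat)|dog/g; // the variant the text advises against

// An unmatched group expands to the empty string in the replacement.
const bracketed = "catdogcat dog".replace(withLookahead, "[$1]");
console.log(bracketed); // "[catdog]cat []"

const input = "catdogcatdogcat";

// Lookahead version: the shared middle cat is not consumed, so both
// dogs are recognized as protected and written back via $1.
const kept = input.replace(withLookahead, "$1");
console.log(kept); // "catdogcatdogcat"

// Consuming version: the first match eats the middle cat, so the second
// dog falls through to the plain dog alternative and is removed.
const broken = input.replace(consuming, "$1");
console.log(broken); // "catdogcatcat"
```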
Now the only question is how we get 000 into the replacement when the second alternative matched.
Unfortunately, we cannot conjure up conditional replacement text that is not part of the input string. The trick is to append 000 to the end of the input string, capture it in a lookahead whenever we find a dog, and then write it back:
text.replace(/$/, "000")
    .replace(/(catdog)(?=cat)|dog(?=.*(000))/g, "$1$2")
    .replace(/000$/, "")
The first replacement appends 000 to the end of the string.
The second replacement either matches catdog (checking that another cat follows) and captures it into group 1 (leaving group 2 empty), or matches dog and captures 000 into group 2 (leaving group 1 empty). Then we write $1$2 back, which is either the untouched catdog or 000.
The third replacement removes our extraneous 000 from the end of the string.
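To make the three steps concrete, here they are applied one at a time to a made-up sample string, with the intermediate results kept in separate variables. This is a sketch under one assumption worth stating: the input is a single line, because . does not match line breaks by default, so the lookahead could not reach the appended 000 across a newline.

```javascript
// Hypothetical sample: two unprotected cases and one protected catdogcat.
const text = "catdogdog dogcat catdogcat";

// Step 1: append the marker that the lookahead will capture later.
const step1 = text.replace(/$/, "000");
console.log(step1); // "catdogdog dogcat catdogcat000"

// Step 2: protected catdog is written back via $1; every other dog
// becomes the captured marker via $2.
const step2 = step1.replace(/(catdog)(?=cat)|dog(?=.*(000))/g, "$1$2");
console.log(step2); // "cat000000 000cat catdogcat000"

// Step 3: strip the marker that was appended in step 1.
const step3 = step2.replace(/000$/, "");
console.log(step3); // "cat000000 000cat catdogcat"
```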
Callback solution
If you are not a fan of the preparation step and the lookahead in the second alternative, you can instead use a slightly simpler regular expression with a replacement callback:
text.replace(/(catdog)(?=cat)|dog/g, function(match, firstGroup) { return firstGroup ? firstGroup : "000" })
With this form of replace, the supplied function is called for each match, and its return value is used as the replacement string. The function's first parameter is the complete match, the second parameter is the first capturing group (which is undefined if the group did not participate in the match), and so on.
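The arguments the callback receives can be inspected by recording them. In this sketch the sample string and the calls array are illustrative; the third parameter, offset, is the position of the match, which JavaScript passes after the capturing groups.

```javascript
const calls = [];

"catdogcat dog".replace(/(catdog)(?=cat)|dog/g, (match, firstGroup, offset) => {
  calls.push({ match, firstGroup, offset });
  return match; // leave the text unchanged; we only inspect the arguments
});

console.log(calls);
// [ { match: "catdog", firstGroup: "catdog", offset: 0 },
//   { match: "dog", firstGroup: undefined, offset: 10 } ]
```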
Thus, in the replacement callback, we are free to produce our 000 if firstGroup is undefined (i.e. the dog alternative matched) or just return firstGroup if it is present (i.e. the catdogcat case). This is a little more concise and perhaps easier to understand. However, the overhead of calling a function for every match makes it slower (whether that matters depends on how often you need to do this). Choose your favorite!
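As a sanity check, the two approaches can be compared on a handful of inputs. The function names viaRegexOnly and viaCallback and the sample list are made up for this sketch; the patterns are the ones shown above, and single-line input is assumed.

```javascript
// Regex-only variant: append marker, replace, strip marker.
const viaRegexOnly = (text) =>
  text
    .replace(/$/, "000")
    .replace(/(catdog)(?=cat)|dog(?=.*(000))/g, "$1$2")
    .replace(/000$/, "");

// Callback variant: decide the replacement inside the function.
const viaCallback = (text) =>
  text.replace(/(catdog)(?=cat)|dog/g, (match, firstGroup) =>
    firstGroup ? firstGroup : "000"
  );

const samples = ["dog", "catdog", "dogcat", "catdogcat", "catdogdog", "catdogcatdogcat", ""];

const results = samples.map((s) => [s, viaRegexOnly(s), viaCallback(s)]);
for (const [input, a, b] of results) {
  console.log(JSON.stringify(input), "->", JSON.stringify(a), a === b ? "(same)" : "(differs)");
}
```

On these samples only a dog with cat on both sides is left alone; every other dog becomes 000 in both variants.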