Regex: how to eliminate URLs ending in .dtd

This is a JavaScript regular expression:

regex = /(http:\/\/[^\s]*)/g;

text = "I have http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd and I like http://google.com a lot";

matches = text.match(regex);

console.log(matches);

I get both URLs as a result. However, I want to exclude all URLs ending in .dtd. How can I do that?

Please note that only URLs ending with .dtd should be removed, so a URL like http://a.dtd.google.com must still pass.

+3
1 answer

The best way to do this is to use a negative lookbehind (in languages that support it):

/(?>http:\/\/[^\s]*)(?<!\.dtd)/g

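The ?> makes the group atomic: once it has matched a URL, the regex engine cannot backtrack into it. Without it, the engine would match the URL, give back one character at a time, and succeed on a truncated URL that no longer ends in .dtd.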

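(?<!\.dtd) is a negative lookbehind: it succeeds only if \.dtd does not match immediately before the current position (i.e. the URL does not end in .dtd).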
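JavaScript has no atomic groups, so the pattern above cannot be used there as written. A minimal sketch of one workaround, assuming an engine with lookbehind support (ES2018 or later, e.g. Node.js 10+): pair the lookbehind with (?!\S), which forces the match to stop at the real end of the URL. The sample text is extended with the http://a.dtd.google.com case from the question, and the variable names are illustrative only.

```javascript
// Assumes lookbehind support (ES2018+, e.g. Node.js 10 or later).
const text = "I have http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd " +
  "and I like http://google.com a lot, also http://a.dtd.google.com";

// Lookbehind alone: the engine gives back one character and still matches,
// so the .dtd URL comes back truncated instead of being excluded.
const naive = /http:\/\/\S*(?<!\.dtd)/g;
const naiveMatches = text.match(naive);

// Lookbehind plus (?!\S): the match must end where the URL ends
// (before whitespace or at end of input), so the truncated match is rejected too.
const anchored = /http:\/\/\S*(?<!\.dtd)(?!\S)/g;
const anchoredMatches = text.match(anchored);

console.log(naiveMatches);
console.log(anchoredMatches);
```

The first pattern returns the hibernate URL cut down to "...hibernate-mapping-3.0.dt" along with the other two URLs. The second returns only http://google.com and http://a.dtd.google.com.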

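In languages that do not support lookbehind or atomic groups (for example, JavaScript, which has no atomic groups and gained lookbehind only in ES2018), you can use a negative lookahead instead: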

/(http:\/\/(?![^\s]*\.dtd(?![^\s]))[^\s]*)/g

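This matches http://, then looks ahead to make sure that the run of non-whitespace characters that follows does not end in .dtd, and only then consumes the rest of the URL.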
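As a quick check, here is a sketch that runs a lookahead pattern of this kind against the text from the question, extended with the http://a.dtd.google.com case. The inner (?![^\s]) ties the .dtd check to the end of the URL (whitespace or end of input must follow). The variable names are illustrative only.

```javascript
const sample = "I have http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd " +
  "and I like http://google.com a lot, and http://a.dtd.google.com too";

// (?![^\s]*\.dtd(?![^\s])) rejects the URL only when ".dtd" is followed by
// whitespace or the end of the input, i.e. when it is the very end of the URL.
const urlNotDtd = /(http:\/\/(?![^\s]*\.dtd(?![^\s]))[^\s]*)/g;

// match() returns null when nothing matches, so fall back to an empty array.
const kept = sample.match(urlNotDtd) || [];

console.log(kept);
```

This logs http://google.com and http://a.dtd.google.com; the hibernate URL is dropped because it ends in .dtd.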

For more detail, see http://www.regular-expressions.info/

+3

Source: https://habr.com/ru/post/1739298/

