JavaScript regex that gets all subdomains

Question

I have the following RegEx:

[!?\.](.*)\.example\.com

and this line:

 test foo abc.def.example.com bar ghi.jkl.example.com def

I want the regex to produce the following matches: def.example.com and jkl.example.com. What do I need to change? It should work for all subdomains of example.com. If possible, it should only capture the first level of the subdomain (abc.def.example.com → def.example.com).

Tested on regexpal , not fully working :(

javascript regex

fnkr Jul 15 '13 at 14:34

2 answers

You can use the following expression: [^.\s]+\.example\.com .

Explanation

[^.\s]+ : match any character except a period or whitespace, one or more times
\.example\.com : match the literal text .example.com

Note that you do not need to escape dots inside a character class.
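
As a minimal sketch of how the expression above could be applied to the line from the question (the variable names `line` and `subdomains` are illustrative and not part of the original answer), the `g` flag makes `.match()` return every match rather than only the first:

```javascript
// Sample line taken from the question.
const line = "test foo abc.def.example.com bar ghi.jkl.example.com def";

// With the g flag, .match() returns an array of all full matches,
// or null when nothing matches (hence the fallback to an empty array).
const subdomains = line.match(/[^.\s]+\.example\.com/g) || [];

console.log(subdomains); // [ 'def.example.com', 'jkl.example.com' ]
```

Because the pattern has no capture groups, the full matches are exactly the strings wanted here.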

Hamza Jul 15 '13 at 14:38

talemyn · Accepted Answer · 2013-07-15T15:57:00+0000

As a side note: while HamZa's answer works for your current sample, if you also need to make sure that the domain names are valid, you may want a different approach, since [^.\s]+ will match ANY character that is not whitespace or a . (for example, that regular expression would accept jk&^%&*(l.example.com as a "valid" subdomain).
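
To make the concern concrete, here is a small sketch that runs the subtractive pattern against a line containing a malformed label (the names `noisyLine` and `looseMatches` are made up for this illustration):

```javascript
// Same sample line, but the second subdomain contains invalid characters.
const noisyLine = "test foo abc.def.example.com bar ghi.jk&^%&*(l.example.com def";

// [^.\s]+ accepts anything that is not a dot or whitespace,
// so the malformed label is matched along with the valid one.
const looseMatches = noisyLine.match(/[^.\s]+\.example\.com/g) || [];

console.log(looseMatches); // [ 'def.example.com', 'jk&^%&*(l.example.com' ]
```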

Since domain names allow far fewer characters than they disallow, you can take an "additive" approach to the regular expression (listing what is allowed) rather than a subtractive one (listing what is not). This pattern is probably the one you are looking for to match valid domain names: /(?:[\s.])([a-z0-9][a-z0-9-]+[a-z0-9]\.example\.com)/gi

Breaking it down a little more:

(?:[\s.]) - a non-capturing group that matches a whitespace character or a . , which marks the start of the lowest-level subdomain
([a-z0-9][a-z0-9-]+[a-z0-9]\.example\.com) - this captures a group of letters, numbers or dashes, which must begin and end with a letter or number (domain name rules), followed by the domain example.com .
gi - makes the regular expression pattern global (find every match, not just the first) and case insensitive.

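At this point, it's just a matter of capturing the matches. Since .match() with the g flag returns only the full matches (including the leading space or dot consumed by the non-capturing group) and discards the capture groups, use .exec() in a loop instead: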

 var domainString = "test foo abc.def.example.com bar ghi.jkl.example.com def";
 var regDomainPattern = /(?:[\s.])([a-z0-9][a-z0-9-]+[a-z0-9]\.example\.com)/gi;
 var aMatchedDomainStrings = [];
 var patternMatch;

 // Loop for as long as .exec() keeps finding a match, and take index 1 of each
 // result (the capture group, which excludes the non-capturing prefix).
 while (null != (patternMatch = regDomainPattern.exec(domainString))) {
     aMatchedDomainStrings.push(patternMatch[1]);
 }

At this point, aMatchedDomainStrings should contain all of your valid first-level subdomains. So:

 var domainString = "test foo abc.def.example.com bar ghi.jkl.example.com def";

... should get you: def.example.com and jkl.example.com , and:

 var domainString = "test foo abc.def.example.com bar ghi.jk&^%&*(l.example.com def";

... should only get you: def.example.com
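
For environments that support ES2020, a possible alternative to the .exec() loop is String.prototype.matchAll, which yields each match together with its capture groups. The sketch below is not part of the original answer; the helper name `firstLevelSubdomains` and the variable names are hypothetical. It also shows what plain .match() returns for comparison.

```javascript
// Pattern from the answer above, unchanged.
const pattern = /(?:[\s.])([a-z0-9][a-z0-9-]+[a-z0-9]\.example\.com)/gi;

// Hypothetical helper: collect capture group 1 from every match.
// matchAll requires the g flag and works on a copy of the regex,
// so the original pattern's lastIndex is not modified.
function firstLevelSubdomains(text) {
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

const validLine = "test foo abc.def.example.com bar ghi.jkl.example.com def";
const invalidLine = "test foo abc.def.example.com bar ghi.jk&^%&*(l.example.com def";

// Plain .match() with the g flag keeps the leading dot from the
// non-capturing group, because it returns full matches only.
const fullMatches = validLine.match(pattern);
console.log(fullMatches); // [ '.def.example.com', '.jkl.example.com' ]

console.log(firstLevelSubdomains(validLine));   // [ 'def.example.com', 'jkl.example.com' ]
console.log(firstLevelSubdomains(invalidLine)); // [ 'def.example.com' ]
```

One thing to be aware of: as written, [a-z0-9][a-z0-9-]+[a-z0-9] requires a label of at least three characters, so one- or two-character subdomains such as ab.example.com would not be matched by this pattern.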
