
Get the second-level domain name from a URL

Is there any way to get the top-level domain name from a URL?

e.g. "https://images.google.com/blah" => "google"

I found this:

var domain = new URL(pageUrl).hostname; 

but it gives me "images.google.com" instead of "google".

The unit tests that I have are:

    https://images.google.com => google
    https://www.google.com/blah => google
    https://www.google.co.uk/blah => google
    https://www.images.google.com/blah => google
+5
5 answers

You can do it like this:

 location.hostname.split('.').pop() 

EDIT

Given the change to your question, you will need a list of all TLDs to match against and strip from the hostname; then you can use split('.').pop()

    // small example list
    // A regex literal anchored at the end, so only the trailing suffix is stripped
    var re = /\.(co\.uk|me|com|us)$/;
    var secondLevelDomain = 'https://www.google.co.uk'.replace(re, '').split('.').pop();
+5

This is the simplest solution, short of maintaining whitelists and blacklists of top-level domains.

  • Match the top-level domain if it has two or more characters, as in "xxxx.yyy".

  • Match the top-level domain and the subdomain before it if both have two characters or fewer, as in "xxxxx.yy.zz".

  • Remove the match.

  • Return everything between the last period and the end of the string.


I split it into two rules joined by the regex OR operator (|):

  • (\.[^\.]*)(\.*$) - from the last period to the end of the string, when the top-level domain has >= 3 characters.
  • (\.[^\.]{0,2})(\.[^\.]{0,2})(\.*$) - top-level domain and subdomain both have <= 2 characters.

    var regex_var = new RegExp(/(\.[^\.]{0,2})(\.[^\.]{0,2})(\.*$)|(\.[^\.]*)(\.*$)/);

    var unit_test = 'xxx.yy.zz.'.replace(regex_var, '').split('.').pop();
    document.write("Returned user entered domain: " + unit_test + "\n");

    var result = location.hostname.replace(regex_var, '').split('.').pop();
    document.write("Current Domain: " + result);
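To see how the two rules behave on the URLs from the question, here is a minimal sketch that runs the same pattern outside the browser. The helper name `secondLevelFromUrl` and the variable names are assumptions made for illustration; only the regular expression comes from the answer.

```javascript
// Same two-rule pattern as in the answer above.
var suffixPattern = /(\.[^\.]{0,2})(\.[^\.]{0,2})(\.*$)|(\.[^\.]*)(\.*$)/;

// Hypothetical helper (not part of the original answer): parse the URL,
// strip the matched suffix from the hostname, and keep the last label.
function secondLevelFromUrl(pageUrl) {
  var hostname = new URL(pageUrl).hostname;
  return hostname.replace(suffixPattern, '').split('.').pop();
}

// The four test cases listed in the question.
var testUrls = [
  'https://images.google.com',
  'https://www.google.com/blah',
  'https://www.google.co.uk/blah',
  'https://www.images.google.com/blah'
];

var results = testUrls.map(secondLevelFromUrl);
console.log(results.join(' | ')); // google | google | google | google
```

Because the pattern relies only on label lengths, it is a heuristic: a suffix such as ".com.au", where one part has three characters, is not treated as a two-part suffix, so "example.com.au" yields "com".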
+3

How about this?

location.hostname.split('.').reverse()[1]

+1

What you want to extract from the URL is not a top-level domain (TLD). The TLD is the rightmost part, e.g. ".com".

Having said that, I don't think there is an easy way to do this, because some URLs end in a two-part "common" suffix such as ".co.uk", and I suppose you don't want to extract "co" in those cases. Perhaps you can check the hostname against a list of existing two-part "TLDs" so that you know which part to extract.
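The list-based idea could be sketched roughly as follows. The function name `getSecondLevelDomain` and the three-entry suffix list are assumptions chosen for illustration; a usable list would have to be far longer (the Public Suffix List is one maintained source of such suffixes).

```javascript
// Illustrative, deliberately incomplete list of two-part suffixes.
var MULTI_PART_SUFFIXES = ['co.uk', 'org.uk', 'com.au'];

// Hypothetical helper: returns the label just before the suffix,
// or null when the hostname has no such label.
function getSecondLevelDomain(hostname) {
  var labels = hostname.toLowerCase().split('.');
  var lastTwo = labels.slice(-2).join('.');

  // A known two-part suffix occupies two labels; otherwise assume a one-label TLD.
  var suffixLength = MULTI_PART_SUFFIXES.indexOf(lastTwo) !== -1 ? 2 : 1;

  var index = labels.length - suffixLength - 1;
  return index >= 0 ? labels[index] : null;
}

console.log(getSecondLevelDomain('www.google.co.uk'));  // google
console.log(getSecondLevelDomain('images.google.com')); // google
```

Checking the last two labels against the list first means ".co.uk" is consumed as a unit, so "co" is never returned for the hostnames covered by the list.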

0
    function getDomainName(hostname) {
        // Known TLDs with an optional 2-3 letter country suffix, or two short labels, or a 2-letter TLD
        var TLDs = new RegExp(/\.(com|net|org|biz|ltd|plc|edu|mil|asn|adm|adv|arq|art|bio|cng|cnt|ecn|eng|esp|etc|eti|fot|fst|g12|ind|inf|jor|lel|med|nom|ntr|odo|ppg|pro|psc|psi|rec|slg|tmp|tur|vet|zlg|asso|presse|k12|gov|muni|ernet|res|store|firm|arts|info|mobi|maori|iwi|travel|asia|web|tel)(\.[a-z]{2,3})?$|(\.[^\.]{2,3})(\.[^\.]{2,3})$|(\.[^\.]{2})$/);
        return hostname.replace(TLDs, '').split('.').pop();
    }

    /*** TEST ***/
    var domains = [
        'domain.com',
        'subdomain.domain.com',
        'www.subdomain.domain.com',
        'www.subdomain.domain.info',
        'www.subdomain.domain.info.xx',
        'mail.subdomain.domain.co.uk',
        'mail.subdomain.domain.xxx.yy',
        'mail.subdomain.domain.xx.yyy',
        'mail.subdomain.domain.xx',
        'domain.xx'
    ];

    var result = [];
    for (var i = 0; i < domains.length; i++) {
        result.push(getDomainName(domains[i]));
    }

    alert(result.join(' | '));
    // result: domain | domain | domain | domain | domain | domain | domain | domain | domain | domain
-1

Source: https://habr.com/ru/post/1203011/

