Get a domain with its subdomain from a URL

I use this function to get the domain and subdomain from a string. But if the string is already in the expected format (no scheme), it returns an empty result.

    function getDomainFromUrl($url) {
        $host = parse_url($url, PHP_URL_HOST);
        return preg_replace('/^www\./', '', $host);
    }

    $url = "http://abc.example.com/"  -> abc.example.com | OK
    $url = "http://www.example.com/"  -> example.com     | OK
    $url = "abc.example.com"          -> FAILS!
4 answers

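The problem is that parse_url finds no host in a string that has no scheme: it treats abc.example.com as a path, so the PHP_URL_HOST component comes back as null (false is returned only for seriously malformed URLs). Check the result before using it; otherwise $host is empty.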

    <?php
    function getDomainFromUrl($url) {
        // Parse once; fall back to the raw string when no host is found.
        $host = parse_url($url, PHP_URL_HOST);
        if ($host == '') {
            $host = $url;
        }
        return preg_replace('/^www\./', '', $host);
    }

    echo getDomainFromUrl("http://abc.example.com/") . "\n";
    echo getDomainFromUrl("http://www.example.com/") . "\n";
    echo getDomainFromUrl("abc.example.com");

Output:

abc.example.com
example.com
abc.example.com
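
To see why the original function fails, it can help to look at what parse_url returns for a string with no scheme. The following is only a minimal sketch; the helper name describeParse is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: returns the components parse_url finds,
// or an empty array if the string is too malformed to parse.
function describeParse(string $url): array {
    $parts = parse_url($url);
    return is_array($parts) ? $parts : [];
}

$withScheme    = describeParse("http://abc.example.com/");
$withoutScheme = describeParse("abc.example.com");

var_dump(isset($withScheme['host']));      // a host component is found
var_dump(isset($withoutScheme['host']));   // no host component here
var_dump($withoutScheme['path'] ?? null);  // the whole string ends up as the path
```

Without a scheme, the string is parsed as a path, which is why falling back to the raw input (or adding a scheme first) is needed.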


This happens because parse_url does not recognize abc.example.com as a host (PHP_URL_HOST) when the scheme is missing, so you need to check for that first. A simple approach: if the URL has no protocol, add one:

    // Prepend "http://" unless the string already starts with http(s):// or ftp(s)://
    function addhttp($url) {
        if (!preg_match("~^(?:f|ht)tps?://~i", $url)) {
            $url = "http://" . $url;
        }
        return $url;
    }

    function getDomainFromUrl($url) {
        // Add the protocol up front instead of recursing: a recursive retry
        // never terminates for input such as an empty string.
        $host = parse_url(addhttp($url), PHP_URL_HOST);
        if (!$host) {
            return false; // no host could be extracted
        }
        return preg_replace('/^www\./', '', $host);
    }

Here's a pure regex solution:

    function getDomainFromUrl($url) {
        if (preg_match('/^(?:https?:\/\/)?(?:(?:[^@]*@)|(?:[^:]*:[^@]*@))?(?:www\.)?([^\/:]+)/', $url, $parts)) {
            return $parts[1];
        }
        return false; // or maybe '', depending on what you need
    }

    getDomainFromUrl("http://abc.example.com/");                       // abc.example.com
    getDomainFromUrl("http://www.example.com/");                       // example.com
    getDomainFromUrl("abc.example.com");                               // abc.example.com
    getDomainFromUrl("username@abc.example.com");                      // abc.example.com
    getDomainFromUrl("https://username:password@abc.example.com");     // abc.example.com
    getDomainFromUrl("https://username:password@abc.example.com:123"); // abc.example.com

You can try it here: http://sandbox.onlinephpfunctions.com/code/3f0343bbb68b190bffff5d568470681c00b0c45c

Here is how the regex breaks down:

    ^                                   the match must start at the beginning of the string
    (?:https?:\/\/)?                    an optional, non-capturing group that matches http:// or https://
    (?:(?:[^@]*@)|(?:[^:]*:[^@]*@))?    an optional, non-capturing group that matches either *@ or *:*@, where * is any run of characters
    (?:www\.)?                          an optional, non-capturing group that matches www.
    ([^\/:]+)                           a capturing group that matches everything up to a '/', a ':', or the end of the string

parse_url() does not extract a host from relative URLs. You can check whether a scheme is present and, if it is not, add a default one:

 if ( !preg_match( '/^([^\:]+)\:\/\//', $url ) ) $url = 'http://' . $url; 
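
Put together with the original function, the scheme check might look like the sketch below. The function name hostFromLooseUrl and the choice to return null on failure are assumptions made for illustration, not part of the answer above.

```php
<?php
// Hypothetical name; a sketch that adds a default scheme before parsing.
function hostFromLooseUrl(string $url): ?string {
    $url = trim($url);
    if ($url === '') {
        return null; // nothing to parse
    }
    // Same check as above: no "scheme://" prefix means we add a default one.
    if (!preg_match('/^([^\:]+)\:\/\//', $url)) {
        $url = 'http://' . $url;
    }
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // parse_url found no host (null) or rejected the URL (false)
    }
    // Strip a leading "www." as the original function does.
    return preg_replace('/^www\./', '', $host);
}

echo hostFromLooseUrl("abc.example.com"), "\n";
echo hostFromLooseUrl("http://www.example.com/"), "\n";
```

Returning null rather than an empty string lets the caller tell "no host found" apart from a valid result with a strict comparison.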

Source: https://habr.com/ru/post/987158/

