Check if the link is internal or external

Hi, I am building something like a web spider in C#. While working on it, I ran into a problem: I need to determine whether a link is internal or external (inbound or outbound). I came up with the following function, but I am not sure it is the best algorithm for the task, so I would like your opinions.

I assume that links without http:// or https:// at the start are internal. If my domain is http://www.blahblah.com, then an absolute link to a page on that domain (for example, one labeled "test") should also be internal, even though it starts with http://. A link such as http://www.somethingelse.com/?var1=http://www.blahblah.com/test, however, is external, so I check only the beginning of the string.

private Boolean checklinkifinternal(String link)
{
    Boolean isinternal = false;
    if (link.IndexOf("http://") == 0 || link.IndexOf("https://") == 0)
    {
        // Then probably external
        if (link.IndexOf("http://" + UrlName) == 0 || link.IndexOf("https://" + UrlName) == 0
            || link.IndexOf("http://www." + UrlName) == 0 || link.IndexOf("https://www." + UrlName) == 0)
        {
            isinternal = true;
        }
    }
    else
    {
        isinternal = true;
    }
    return isinternal;
}
2 answers
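Uri.Compare(new Uri("http://google.de"), new Uri("http://Google.de"), UriComponents.Host, UriFormat.SafeUnescaped, StringComparison.OrdinalIgnoreCase);

This is off the top of my head :) A result of 0 means the two hosts match. Note that new Uri(...) needs an absolute URI including the scheme; a bare "google.de" throws a UriFormatException.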
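
To apply the same idea to the links a spider actually meets, a minimal sketch could resolve each link against the page it was found on and then compare hosts. The names LinkClassifier, IsInternal and StripWww are hypothetical, and treating "www." as equivalent to the bare domain is an assumption taken from the question's code, not a general rule.

```csharp
using System;

public static class LinkClassifier
{
    // Hypothetical helper: pageUrl is the absolute URL of the page being crawled,
    // link is the raw href value found on that page.
    public static bool IsInternal(string pageUrl, string link)
    {
        var baseUri = new Uri(pageUrl, UriKind.Absolute);

        // Relative links are resolved against the page URL;
        // absolute links are returned unchanged.
        if (!Uri.TryCreate(baseUri, link, out Uri resolved))
        {
            return false; // assumption: unparsable links are not treated as internal
        }

        // Only the host is compared, so a domain that merely appears
        // in the query string of another site's URL does not count.
        return string.Equals(StripWww(baseUri.Host), StripWww(resolved.Host),
                             StringComparison.OrdinalIgnoreCase);
    }

    // Assumption carried over from the question: "www.example.com" and
    // "example.com" are considered the same site.
    private static string StripWww(string host)
    {
        return host.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
            ? host.Substring(4)
            : host;
    }
}
```

With this, LinkClassifier.IsInternal("http://www.blahblah.com/", "http://www.somethingelse.com/?var1=http://www.blahblah.com/test") is false, because only the host of the resolved link is compared rather than a string prefix.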


It depends. If you are on a page served over http, does an https URI count as an internal link when the domain name is the same (and vice versa)? You will have to decide.

Also, your algorithm does not account for local file system links (those using file://).
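
Both points can be made explicit in code. The following is only a sketch, assuming both URIs are already absolute; LinkKind, SchemeAwareClassifier and the requireSameScheme switch are hypothetical names introduced for illustration.

```csharp
using System;

public enum LinkKind { Internal, External, Unsupported }

public static class SchemeAwareClassifier
{
    // Assumes page and link are absolute URIs.
    // requireSameScheme is a hypothetical policy switch: when true, an https
    // link found on an http page (or vice versa) is classified as external.
    public static LinkKind Classify(Uri page, Uri link, bool requireSameScheme)
    {
        bool isWeb = link.Scheme == Uri.UriSchemeHttp
                  || link.Scheme == Uri.UriSchemeHttps;
        if (!isWeb)
        {
            // file:, mailto:, ftp: and similar are reported separately
            // instead of being guessed as internal or external.
            return LinkKind.Unsupported;
        }

        bool sameHost = string.Equals(page.Host, link.Host,
                                      StringComparison.OrdinalIgnoreCase);
        bool sameScheme = page.Scheme == link.Scheme;

        return sameHost && (sameScheme || !requireSameScheme)
            ? LinkKind.Internal
            : LinkKind.External;
    }
}
```

Whether requireSameScheme should be true depends on how the crawler treats the http and https versions of a site, which is exactly the decision described above.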


Source: https://habr.com/ru/post/905085/

