Hi, I am creating something like a web spider in C#. While working on it, I ran into a problem: I need to determine whether a link is internal or external (inbound or outbound). I came up with the following function, but I'm not sure it is the best algorithm for this task, so I would like your opinions.
I assume that links that do not start with http:// or https:// are internal. If my domain is http://www.blahblah.com, then a link such as http://www.blahblah.com/test should also be internal even though it starts with http://. A link such as http://www.somethingelse.com/?var1=http://www.blahblah.com/test, on the other hand, is external, which is why I check only the beginning of the string.
    private Boolean checklinkifinternal(String link)
    {
        Boolean isinternal = false;
        if (link.IndexOf("http://") == 0 || link.IndexOf("https://") == 0)
        {
            //Then probably external
            if (link.IndexOf("http://" + UrlName) == 0 || link.IndexOf("https://" + UrlName) == 0 || link.IndexOf("http://www." + UrlName) == 0 || link.IndexOf("https://www." + UrlName) == 0)
            {
                isinternal = true;
            }
        }
        else
        {
            isinternal = true;
        }
        return isinternal;
    }
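For comparison, here is a rough sketch of the same rule written with System.Uri instead of string prefixes. The names LinkClassifier, IsInternalLink, and siteHost are made up for illustration, and siteHost is assumed to hold the bare domain (for example blahblah.com), the way UrlName seems to in my function. It keeps my assumption that anything that is not an http:// or https:// URL counts as internal.

```csharp
using System;

public static class LinkClassifier
{
    // Hypothetical helper: compares the host of the link with the site's host
    // instead of comparing string prefixes.
    // siteHost is assumed to be a bare domain such as "blahblah.com".
    public static bool IsInternalLink(string link, string siteHost)
    {
        Uri uri;

        // Anything that does not parse as an absolute URL is treated as internal.
        if (!Uri.TryCreate(link, UriKind.Absolute, out uri))
        {
            return true;
        }

        // Same assumption as the original function: only http/https links can be external.
        // This also covers platforms where "/test" parses as a file path.
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
        {
            return true;
        }

        // Ignore a leading "www." so that both forms of the domain match.
        string host = uri.Host;
        if (host.StartsWith("www.", StringComparison.OrdinalIgnoreCase))
        {
            host = host.Substring(4);
        }

        return string.Equals(host, siteHost, StringComparison.OrdinalIgnoreCase);
    }
}
```

One difference from the prefix check: a link such as http://blahblah.com.evil.example/ starts with "http://blahblah.com", so my function would call it internal, while comparing the whole host name treats it as external.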