ASP.NET MVC: exclude an action from search engine crawling

Is there a way to exclude a controller action from search engine crawlers? Is there an MVC attribute that can be added above the action method?

I want to exclude the following URL from search engine results:

 Home/Secret?type=1 

But I want the following URL to remain crawlable:

 Home/Search 
+6
4 answers

I think you need to dynamically generate the robots.txt file.

You need to create a controller action that serves the robots.txt file.

See the link here for that approach.

Related to the link above, there is also the question of how to allow the .txt extension to invoke an action: fooobar.com/questions/375916/...
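One commonly suggested way to let a URL with a .txt extension reach MVC routing rather than the static file handler (an assumption on my part, since the linked question is only referenced) is to run all managed modules for every request in web.config:

 <!-- web.config: lets the routing module see requests such as /robots.txt -->
 <system.webServer>
   <modules runAllManagedModulesForAllRequests="true" />
 </system.webServer>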

 public ActionResult Robots()
 {
     Response.ContentType = "text/plain";
     //-- Here you should write out the list of areas/controllers/actions
     //-- that search engines should not follow.
     return View();
 }

Add a Robots.cshtml view.
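As a rough sketch (assuming the action passes the list produced by NoRobotsAttribute.GetNoRobots(), shown further down, as its model), Robots.cshtml could emit the directives directly:

 @* Robots.cshtml - minimal sketch; assumes a List<string> of disallowed paths as the model *@
 @model List<string>
 @{
     Layout = null;
 }
 User-agent: *
 @foreach (var path in Model)
 {
     @:Disallow: @path
 }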

Map a route so that a request for robots.txt invokes the action above instead of looking for a physical file.

 routes.MapRoute("Robots.txt", "robots.txt", new { controller = "Home", action = "Robots" }); 

Here is a NoRobots attribute, along with code that collects the areas/controllers/actions marked with it. Apologies for the full-namespace parsing; I would welcome suggestions for a cleaner approach.

 using System;
 using System.Collections.Generic;
 using System.Linq;
 using System.Reflection;
 using System.Web.Mvc;

 public sealed class NoRobotsAttribute : System.Attribute
 {
     public static IEnumerable<MethodInfo> GetActions()
     {
         return Assembly.GetExecutingAssembly().GetTypes()
             .Where(t => typeof(Controller).IsAssignableFrom(t))
             .SelectMany(type => type.GetMethods(BindingFlags.Public | BindingFlags.Instance)
                 .Where(a => a.ReturnType == typeof(ActionResult)));
     }

     public static IEnumerable<Type> GetControllers()
     {
         return Assembly.GetExecutingAssembly().GetTypes()
             .Where(t => typeof(Controller).IsAssignableFrom(t));
     }

     public static List<string> GetNoRobots()
     {
         var robotList = new List<string>();

         //-- Controllers decorated with [NoRobots]: disallow the whole controller.
         foreach (var controllerType in GetControllers())
         {
             var robotAttributes = controllerType
                 .GetCustomAttributes(typeof(NoRobotsAttribute), false)
                 .Cast<NoRobotsAttribute>();

             //-- Any parameters on the attribute could be inspected here; none are currently specified.
             if (!robotAttributes.Any())
                 continue;

             List<string> namespaceSplit = controllerType.FullName.Split('.').ToList();

             var controllersIndex = namespaceSplit.IndexOf("Controllers");
             var controller = controllersIndex > -1 ? "/" + namespaceSplit[controllersIndex + 1] : "";

             robotList.Add(controller);
         }

         //-- Actions decorated with [NoRobots]: disallow the individual action.
         foreach (var methodInfo in GetActions())
         {
             var robotAttributes = methodInfo
                 .GetCustomAttributes(typeof(NoRobotsAttribute), false)
                 .Cast<NoRobotsAttribute>();

             if (!robotAttributes.Any())
                 continue;

             List<string> namespaceSplit = methodInfo.DeclaringType.FullName.Split('.').ToList();

             var areaIndex = namespaceSplit.IndexOf("Areas");
             var area = areaIndex > -1 ? "/" + namespaceSplit[areaIndex + 1] : "";

             var controllersIndex = namespaceSplit.IndexOf("Controllers");
             var controller = controllersIndex > -1 ? "/" + namespaceSplit[controllersIndex + 1] : "";

             var action = "/" + methodInfo.Name;

             robotList.Add(area + controller + action);
         }

         return robotList;
     }
 }

Usage:

 [NoRobots] // Can be applied at the controller or action method level.
 public class HomeController : Controller
 {
     [NoRobots]
     public ActionResult Index()
     {
         ViewData["Message"] = "Welcome to ASP.NET MVC!";

         List<string> x = NoRobotsAttribute.GetNoRobots();
         //-- Just some test code that writes the result to a web page.

         return View(x);
     }
 }

... and for areas.

 namespace MVC.Temp.Areas.MyArea.Controllers
 {
     using MVC.Temp.Models.Home;

     [NoRobots]
     public class SubController : Controller
     {
         [NoRobots]
         public ActionResult SomeAction()
         {
             return View();
         }
     }
 }

Keep in mind that this solution depends on namespace conventions, so any improvements are welcome.

Finally, you need to write out the robots.txt file correctly, including any header information and wildcard support.
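Tying the pieces together, a minimal sketch of the Robots action from the start of this answer, now feeding the Robots.cshtml view the list from GetNoRobots():

 public ActionResult Robots()
 {
     Response.ContentType = "text/plain";
     // Hand the disallowed paths to the Robots.cshtml view shown earlier.
     return View(NoRobotsAttribute.GetNoRobots());
 }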

+10

If the URL is publicly accessible, and especially if it is linked from a page, a crawler can and will find it. You can use rel="nofollow" on links, add a noindex meta tag to the page itself, or use a robots.txt file to Disallow the page. This will stop all well-behaved search engines (for example Google, Bing, Yahoo) from indexing it or following links to it, but it will not stop arbitrary bots from looking at the page.
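For reference (plain markup, nothing specific to this project), the meta tag and a nofollow link look like this:

 <!-- in the <head> of the page that should not be indexed -->
 <meta name="robots" content="noindex" />

 <!-- on links pointing to the page -->
 <a href="/Home/Secret?type=1" rel="nofollow">Secret</a>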

However, the URL remains publicly available: if a person can visit it, so can a machine. If you do not want it available to the general public at all, you probably want to look into user authentication.

+2

Maybe you just need to change the routing. You can add the following route; it maps the address Home/Search to the Secret action, so Home/Secret?type=1 can be reached as Home/Search:

 routes.MapRoute(
     name: "NewRoute",
     url: "{controller}/Search",
     defaults: new { controller = "Home", action = "Secret", type = UrlParameter.Optional }
 );

You can also hide the controller name:

 routes.MapRoute(
     name: "NewRoute",
     url: "LadyGaga/Search",
     defaults: new { controller = "Home", action = "Secret", type = UrlParameter.Optional }
 );
0

Do you want to hide it from search engines, or should no one be able to visit this URL at all? Keep in mind that anyone who requests your robots.txt will see the URLs listed there.

Can't you just set up authorization so that only certain users can access these actions? When an HTTP 401 is returned, search engines will not index the page.
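For example (a minimal sketch using the built-in AuthorizeAttribute; the role name is made up), restricting the action keeps crawlers out as well:

 public class HomeController : Controller
 {
     // Anonymous requests get a 401 (or a redirect to the login page with forms
     // authentication), so search engines never see the content.
     [Authorize(Roles = "Members")]
     public ActionResult Secret(int type)
     {
         return View();
     }
 }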

0

Source: https://habr.com/ru/post/951512/

