A crawler impact rule determines the rate at which the Windows SharePoint Services Help Search service requests documents from a website during a crawl. The rate can be defined either as the number of documents requested simultaneously or as the delay between requests. In the absence of a crawler impact rule, the number of documents requested ranges from 5 to 16, depending on hardware resources.
You can use crawler impact rules to modify the load placed on sites when you crawl them.
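The two throttling modes a crawler impact rule can express can be sketched in code. The following is an illustrative Python sketch, not the SharePoint API: the class name, parameters, and default values are assumptions chosen for the example.

```python
import threading
import time

class CrawlerThrottle:
    """Illustrative sketch (not the SharePoint API) of the two limits a
    crawler impact rule can express: a cap on the number of documents
    requested simultaneously, or a minimum delay between requests."""

    def __init__(self, max_concurrent=8, delay_seconds=0.0):
        # Without a rule, the service requests 5 to 16 documents at a
        # time depending on hardware; 8 here is an arbitrary example.
        self._slots = threading.Semaphore(max_concurrent)
        self._delay = delay_seconds
        self._lock = threading.Lock()
        self._last_request = 0.0

    def request(self, fetch, url):
        with self._slots:              # enforce the concurrency cap
            with self._lock:           # enforce the inter-request delay
                wait = self._last_request + self._delay - time.monotonic()
                if wait > 0:
                    time.sleep(wait)
                self._last_request = time.monotonic()
            return fetch(url)
```

A rule that sets `delay_seconds` trades crawl speed for a lighter load on the target site, while `max_concurrent` bounds how many requests are in flight at once.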
Crawl rules let you control the behavior of the Enterprise Search indexing engine when it crawls content at a specific path. Using these rules, you can:
- Prevent content in a specific path from being crawled.
For example, if the content source points to a URL path such as http://www.microsoft.com/ but you want to block content from the "downloads" subdirectory (http://www.microsoft.com/downloads/), create a crawl rule for that URL with the behavior set to exclude content.
- Specify that a path that would otherwise be excluded from the crawl should be included.
Continuing the previous scenario, if the "downloads" directory contains a subdirectory named "content" that should be included in the crawl, create a crawl rule for the URL http://www.microsoft.com/downloads/content with the behavior set to include content.
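The scenario above can be sketched as ordered rule evaluation. This is an illustrative Python sketch, not the SharePoint API; it assumes rules are checked in order with the first matching rule deciding the outcome, which is why the more specific "include content" rule must precede the broader "exclude downloads" rule.

```python
# Illustrative sketch of crawl-rule evaluation (not the SharePoint API).
# Assumption: rules are an ordered list of (path_prefix, include) pairs,
# and the first rule whose prefix matches the URL wins.

def should_crawl(url, rules):
    """Return True if the URL should be crawled under the given rules."""
    for prefix, include in rules:
        if url.startswith(prefix):
            return include
    return True  # no rule matched: crawl by default

rules = [
    ("http://www.microsoft.com/downloads/content", True),  # include subdirectory
    ("http://www.microsoft.com/downloads/", False),        # exclude parent path
]
```

With these rules, pages under /downloads/content are crawled, the rest of /downloads is excluded, and everything else on the site is unaffected.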