Block all bots / crawlers / spiders for a specific directory with .htaccess

I am trying to block all bots / crawlers / spiders for one specific directory. How can I do this with .htaccess? I searched a bit and found a solution that blocks based on the user agent:

 RewriteCond %{HTTP_USER_AGENT} googlebot 

Now I need more user agents (for all known bots), and the rule should apply only to that one directory. I already have a robots.txt file, but not all crawlers respect it. Blocking by IP address is not an option. Are there other solutions? I know about password protection, but I would first have to ask whether that is an option. Either way, I am looking for a user-agent-based solution.

3 answers

You need mod_rewrite enabled. Put the following in the .htaccess file in that folder. If you place it elsewhere (for example, in the parent folder), the RewriteRule pattern needs to be modified slightly to include the folder name.

 RewriteEngine On
 RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider) [NC]
 RewriteRule .* - [R=403,L]
  • I have listed only a few bots; add others yourself (case does not matter because of the [NC] flag).
  • This rule responds with "403 Forbidden" to such requests. You can use a different HTTP response code if you really want to (403 is the most suitable here, given your requirements).
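
For the parent-folder case mentioned above, a minimal sketch might look like the following. It assumes the protected directory is called private (a hypothetical name, not from the question) and that this .htaccess sits one level above it. In a per-directory context the pattern is matched against the path relative to the folder holding the .htaccess, so it carries no leading slash. The [F] flag is used here as the shorthand for a 403 response.

```apache
# Parent-folder .htaccess; "private" is a hypothetical directory name.
RewriteEngine On

# Match any of the listed bots in the User-Agent header, case-insensitively.
RewriteCond %{HTTP_USER_AGENT} (googlebot|bingbot|Baiduspider) [NC]

# Only requests under private/ are refused with 403 Forbidden;
# the rest of the site stays reachable for these user agents.
RewriteRule ^private/ - [F,L]
```

Extending the alternation in the RewriteCond with further bot names works the same way as in the rule above.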

Why use .htaccess or mod_rewrite for a job that robots.txt is specifically designed for? Here is the robots.txt snippet you need to block a specific set of directories.

 User-agent: *
 Disallow: /subdir1/
 Disallow: /subdir2/
 Disallow: /subdir3/

This will block all search bots from the directories /subdir1/, /subdir2/ and /subdir3/.

See here for more details: http://www.robotstxt.org/orig.html


I know the topic is old, but for people who also landed here (as I did): take a look at the excellent 5G Blacklist 2013.
It is a great help and no, it is not only for WordPress; it works for all other sites too. It works amazingly well, IMHO.
Another one worth a look might be the Linux anti-spam rules for .htaccess.


Source: https://habr.com/ru/post/916546/
