Facebook externalhit_uatext robot lowercase URL

I am working on a site with URLs similar to YouTube's. We generate identifiers on the server, and I chose base 62 (digits plus lowercase and uppercase letters) so that they are shorter. So a URL could be something like example.com/user/123AbCaBc. It seems that the Facebook robot regularly requests a lowercased version of it, example.com/user/123abcabc. This results in a 404 error because the lowercase identifier is not in the database.
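To illustrate why a lowercased URL cannot resolve, here is a minimal sketch of a base-62 codec. The alphabet order and the function names `encodeBase62` / `decodeBase62` are assumptions made for illustration, not the actual scheme used on the site.

```javascript
// Hypothetical alphabet: digits, then lowercase, then uppercase letters.
// Any fixed ordering of the 62 symbols would behave the same way.
const ALPHABET =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Encode a non-negative BigInt as a base-62 string.
function encodeBase62(n) {
  if (n === 0n) return ALPHABET[0];
  let out = "";
  while (n > 0n) {
    out = ALPHABET[Number(n % 62n)] + out;
    n /= 62n; // BigInt division truncates
  }
  return out;
}

// Decode a base-62 string back to a BigInt; throws on foreign characters.
function decodeBase62(id) {
  let n = 0n;
  for (const ch of id) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new Error(`invalid base-62 character: ${ch}`);
    n = n * 62n + BigInt(digit);
  }
  return n;
}
```

With this sketch, `decodeBase62("123AbCaBc")` and `decodeBase62("123abcabc")` are different numbers, so lowercasing an identifier yields a different (and most likely nonexistent) record rather than the same one.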

According to the logs, there are no other user agents creating 404, so this is definitely a robot, not a human. Here's the user agent I see:

facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)

This happens approximately every 4 minutes. Currently, I only log 404 responses and not successful hits, so I'm not sure whether the robot also requests variants other than the lowercase one.
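One way to find out which variants are being requested is to classify each requested identifier against the known ones before logging it. The following is only a sketch: `classifyMiss` and its labels are hypothetical names, and `knownIds` stands in for whatever lookup the database provides.

```javascript
// Classify a requested identifier against a Set of known identifiers.
// Returns one of: "exact", "lowercased", "other-case-variant", "unknown".
function classifyMiss(requestedId, knownIds) {
  if (knownIds.has(requestedId)) return "exact";

  const lower = requestedId.toLowerCase();
  for (const id of knownIds) {
    if (id.toLowerCase() === lower) {
      // Same letters as a known id, but the case differs.
      return requestedId === lower ? "lowercased" : "other-case-variant";
    }
  }
  return "unknown";
}
```

For example, `classifyMiss("123abcabc", new Set(["123AbCaBc"]))` returns `"lowercased"`. Logging this label together with the user agent would show whether the robot only ever sends the lowercase form. The linear scan is fine for a quick experiment; a real deployment would more likely store a lowercased copy of each identifier and query that.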

The server stack is Node.js / MongoDB, but I don't see how that relates to the problem.

Is there something I can do to fix this on the Facebook side? Is there a real problem here, or should I just suppress these log errors? Does anyone else have a similar problem?

1 answer

Is it possible that your Node "web server application" (using Express?) does not currently support byte ranges? The Facebook crawler appears to have a reason to fall back to the lowercase URL, as described here:

Take a look.


Source: https://habr.com/ru/post/1568819/

