Browser automation with Selenium: fingerprints, recognition and traceability?

I want to use Selenium / WebDriver to simulate a browser and scrape site content with it. Even if this is not the fastest method, it has many advantages for me, such as executing scripts, etc.

Many websites, including search engines such as Google or Bing, deny access to clients that use automated methods.

For one tool, I need to scrape the estimated result count from Google for several keywords. It would work like this: simulate a browser that visits google.com, enters a keyword and scrapes the result, then after a short pause enters the next keyword, scrapes the result, and so on ...

My question is: can a website recognize that I am using Selenium to simulate a browser instead of using the browser manually? The Google case in particular raises some doubts. I know that Selenium is partly developed by Google, or at least by some people working for Google. So does Selenium leave some fingerprints, or is it impossible to tell whether I am using the browser myself or driving it with Selenium, even for Google?
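A minimal sketch of that loop in Python, for illustration only. The helper names (parse_result_count, collect_counts, make_selenium_fetcher) are hypothetical, and the search box name "q" and the element id "result-stats" are assumptions about Google's page that may not hold (a consent page or a layout change would break them). The parser also assumes an English-style number such as "1,230,000".

```python
import re
import time


def parse_result_count(stats_text):
    """Return the first number in text like 'About 1,230,000 results', or None."""
    match = re.search(r"\d[\d.,]*", stats_text)
    if match is None:
        return None
    digits = re.sub(r"\D", "", match.group(0))  # drop thousands separators
    return int(digits) if digits else None


def collect_counts(keywords, fetch_stats_text, pause_seconds=5.0, sleep=time.sleep):
    """Look up each keyword in turn, pausing between consecutive lookups."""
    counts = {}
    for index, keyword in enumerate(keywords):
        if index > 0:
            sleep(pause_seconds)  # short pause before the next keyword
        counts[keyword] = parse_result_count(fetch_stats_text(keyword))
    return counts


def make_selenium_fetcher(driver, timeout_seconds=10):
    """Build a fetcher around an existing Selenium WebDriver instance."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def fetch_stats_text(keyword):
        driver.get("https://www.google.com")
        box = driver.find_element(By.NAME, "q")  # assumed search box name
        box.send_keys(keyword, Keys.RETURN)
        stats = WebDriverWait(driver, timeout_seconds).until(
            EC.presence_of_element_located((By.ID, "result-stats"))  # assumed id
        )
        return stats.text

    return fetch_stats_text
```

With a real driver this would be called as collect_counts(["first keyword", "second keyword"], make_selenium_fetcher(driver)).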

+4
2 answers

No, no one can see that you are driving the browser with Selenium WebDriver rather than using it manually yourself. I'm not sure about the old Selenium RC, but it should be the same. Here's how it works:

  • Selenium opens a browser with a clean profile (or with a selected profile)
  • Selenium hooks into the browser so that it can control it. The browser still does most of the work, though. Basically, Selenium replaces the user's input to the browser, and nothing more.

You can easily verify this by reading the contents of the HTTP headers sent by your browser.
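One way to do this check is to point both a manually opened browser and a Selenium-driven one at a small local server that echoes the request headers back, then compare the two responses. The sketch below uses only the Python standard library; HeaderEchoHandler, start_header_echo_server and fetch_echoed_headers are hypothetical names made up for this example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Answer every GET with the request's own headers as JSON."""

    def do_GET(self):
        body = json.dumps(dict(self.headers.items()), indent=2).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_header_echo_server(port=0):
    """Start the echo server in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HeaderEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_echoed_headers(url, extra_headers=None):
    """Scripted client for a quick check; a browser can open the same URL."""
    request = urllib.request.Request(url, headers=extra_headers or {})
    # Bypass any configured HTTP proxy so the request stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

After start_header_echo_server(8000), opening http://127.0.0.1:8000/ once by hand and once through Selenium shows what each session sends. This only compares HTTP headers; it says nothing about what JavaScript running in the page can observe.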

If you really need Selenium to be recognized by your server, you can use Browsermob-proxy and add a custom header to your requests.


All that said, there is one thing you should be aware of. While there is no way to detect Selenium directly, the website you are visiting may pick up some indirect clues. Usually these involve spotting too many requests made in quick succession - this may be a problem for you. Make sure your Selenium script behaves like a user.
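As a rough illustration of pacing requests, the sketch below spaces actions out with a base delay plus random jitter instead of firing them back to back. The function names and the default timings are assumptions chosen for this example, not values known to satisfy any particular site.

```python
import random
import time


def human_delays(count, base_seconds=8.0, jitter_seconds=4.0, rng=None):
    """Return `count` delays, each between base and base + jitter seconds."""
    if rng is None:
        rng = random.Random()
    return [base_seconds + rng.uniform(0, jitter_seconds) for _ in range(count)]


def run_paced(actions, base_seconds=8.0, jitter_seconds=4.0,
              sleep=time.sleep, rng=None):
    """Call each zero-argument action in order, sleeping between calls."""
    delays = human_delays(max(len(actions) - 1, 0),
                          base_seconds, jitter_seconds, rng)
    results = []
    for index, action in enumerate(actions):
        if index > 0:
            sleep(delays[index - 1])  # irregular gap before the next action
        results.append(action())
    return results
```

Each action would typically be a small function that performs one page visit with the driver. Irregular gaps only address the request-rate clue mentioned above, not any other signal a site might use.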


EDIT 2016/04:

Perhaps it is possible after all: fooobar.com/questions/55660/... states that there is a company that can do this. My hunch - and this is nothing more than a hunch - is that they can detect some JS that Selenium installs in the browser in order to work.
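To see what a page's own JavaScript could observe, one option is to run a probe through the driver itself. The sketch below is not part of the original answer: it relies on the navigator.webdriver attribute defined in the W3C WebDriver specification, plus a scan of window and document property names whose pattern (selenium|webdriver) is purely an illustrative guess. probe_automation_signals and looks_automated are hypothetical names.

```python
# JavaScript run inside the page; it returns a plain object to Python.
PROBE_SCRIPT = r"""
var pattern = /selenium|webdriver/i;  // illustrative guess, not a known list
function matching(obj) {
    return Object.keys(obj).filter(function (name) { return pattern.test(name); });
}
return {
    webdriverFlag: navigator.webdriver === true,
    suspiciousGlobals: matching(window).concat(matching(document))
};
"""


def probe_automation_signals(driver):
    """Run the probe via any object exposing execute_script (e.g. a WebDriver)."""
    result = driver.execute_script(PROBE_SCRIPT) or {}
    return {
        "webdriver_flag": bool(result.get("webdriverFlag")),
        "suspicious_globals": sorted(result.get("suspiciousGlobals") or []),
    }


def looks_automated(signals):
    """True if any probed signal suggests the browser is being driven."""
    return signals["webdriver_flag"] or bool(signals["suspicious_globals"])
```

Running probe_automation_signals(driver) on a Selenium session and pasting the same JavaScript into the console of a manually opened browser gives a direct comparison. A site may of course look at other signals that this probe does not cover.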

+3

Signs indicate yes, sites can recognize that you are using Selenium.
Counter example: www.stubhub.com detects and blocks my browser instance launched using Selenium, while “normal” browsing done manually (without using a browser launched by the Selenium web driver) works without problems.

See this Stack Overflow question for more information: Can a website detect when you are using Selenium with chromedriver?

+1

Source: https://habr.com/ru/post/1491422/

