I am setting up a large Selenium Grid and trying to achieve high availability. Currently, the Hub is a single point of failure: if that machine goes down, my whole grid goes down. I want to achieve redundancy.
I am looking for approaches that people are taking to solve this problem. The hub is built in such a way that I cannot load balance it. Is master/slave failover the best I can achieve?
Are there any packaged solutions for this?
If not, are there any tips for mitigating the impact of such a failure: what logic should trigger the failover to another resource, and how can I minimize the time it takes for nodes to connect to the new hub?
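To make the last question concrete, here is a minimal sketch of the kind of trigger logic I have in mind. It assumes the hub exposes some HTTP status URL that returns 200 when healthy. The names `hub_is_up` and `FailoverMonitor`, the status URL, and the threshold of three consecutive failed probes are all placeholders of mine, not part of Selenium.

```python
import http.client
import urllib.error
import urllib.request


def hub_is_up(status_url, timeout=2.0):
    """Return True if the (assumed) hub status URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(status_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, malformed reply, etc.
        return False


class FailoverMonitor:
    """Counts consecutive failed probes and signals when to switch hubs."""

    def __init__(self, max_failures=3):
        # Hypothetical threshold: tolerate brief blips before failing over.
        self.max_failures = max_failures
        self.failures = 0

    def record(self, healthy):
        """Record one probe result; return True when failover should trigger."""
        if healthy:
            self.failures = 0
        else:
            self.failures += 1
        return self.failures >= self.max_failures
```

A watchdog would call `monitor.record(hub_is_up(url))` on a fixed interval and, once it returns `True`, promote the standby hub and repoint the nodes. Requiring several consecutive failures is meant to avoid failing over on a single dropped probe, at the cost of a slower reaction.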
Thanks in advance!