Here is my simplified use case:
- I have two embedded nodes, each running in its own JVM on the same physical machine. I start them and they form a simple cluster.
- Both nodes try to acquire the same lock.
- The first one to get the lock holds it for 30 seconds.
- If I kill the node that holds the lock, the cluster needs between 5 and 10 seconds to conclude that the node is dead and release the lock.
My question is: can the interval between killing the node that holds the lock and the cluster actually releasing the lock be configured? I need it to be less than 1 second.
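For reference, the delay described above could be measured on the surviving node with a small helper like the one below. This is only a sketch: `LockReleaseTimer` and `measureAcquireMillis` are hypothetical names, and it assumes the distributed lock object implements `java.util.concurrent.locks.Lock` (which I believe the Hazelcast 3.x `ILock` does).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

// Hypothetical helper, not part of Hazelcast: times how long a blocking lock() call takes.
final class LockReleaseTimer {

    private LockReleaseTimer() {
    }

    // Blocks until the lock is acquired, releases it again,
    // and returns how long the caller had to wait, in milliseconds.
    static long measureAcquireMillis(Lock lock) {
        long start = System.nanoTime();
        lock.lock();
        try {
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            lock.unlock();
        }
    }
}
```

If the call is started at about the moment the lock-holding JVM is killed, the returned value approximates how long the cluster took to notice the dead node and free the lock.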
I tried some of the available properties that seemed to be related to this problem:
hazelcast.socket.connect.timeout.seconds
hazelcast.client.heartbeat.timeout
hazelcast.client.invocation.timeout.seconds
None of these helped; I did not notice any change in the behavior of the lock.
Update:
These two seem to do the job:
<property name="hazelcast.socket.connect.timeout.seconds">1</property>
<property name="hazelcast.connection.monitor.max.faults">1</property>
I have yet to find out if this will cause stability problems in a real use scenario. In this simple test, it works quite well.
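For an embedded node, the same two values could presumably also be supplied as JVM system properties instead of XML, since Hazelcast reads `hazelcast.*` settings from system properties as well. A minimal sketch, where `FastFailureDetection` is a hypothetical helper name and the values are simply the ones from my test:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: applies the two properties from the update as JVM system properties.
final class FastFailureDetection {

    private FastFailureDetection() {
    }

    // The two settings that appeared to shorten failure detection in the simple test.
    static Map<String, String> settings() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hazelcast.socket.connect.timeout.seconds", "1");
        props.put("hazelcast.connection.monitor.max.faults", "1");
        return props;
    }

    // Should run before the embedded node is created so the values are visible at startup.
    static void apply() {
        settings().forEach(System::setProperty);
    }
}
```

Calling `FastFailureDetection.apply()` at the top of `main`, before the node is started, should have the same effect as passing the values with `-D` on the command line.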