If we run Multi-Paxos, a node may see the following sequence of messages:

    Propose(N)
    Accept!(N,Vn)
    Accept!(N+1,Vm)
    Accept!(N+4,Vo) // huh? where is +2, +3?
    Accept!(N+5,Vp)

This can happen for either of two reasons:
- There was a stable leader, but the network local to this node dropped or delayed the messages for +2 and +3.
- There was a leader failure followed by two failed attempts to propose, so +2 and +3 were unsuccessful proposal rounds.
In general, operations on a distributed finite state machine do not commute, so a node must apply all operations in order. This means the node must be able to distinguish between the two cases. If +2 and +3 were unsuccessful proposal rounds, the node has no problem and can carry on. If they were lost messages, the node should either wait for them to arrive or try to recover the lost data (for example, by requesting a snapshot to reinitialize and catch up).
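To make the problem concrete, here is a minimal sketch of a node that buffers incoming accepts and applies them strictly in order. It assumes, as in the message trace above, that the numbers double as the apply order; the class `InOrderApplier` and its methods are hypothetical names used only for illustration, not part of any Paxos library. The sketch detects the gap, but note that nothing in it can tell whether +2 and +3 were lost or were simply never chosen, which is exactly the question.

```python
class InOrderApplier:
    """Hypothetical helper: buffers accepted values and applies them in number order."""

    def __init__(self, first_number):
        self.next_number = first_number  # next number we expect to apply
        self.pending = {}                # number -> value, received but not yet applied
        self.applied = []                # values applied to the state machine, in order

    def on_accept(self, number, value):
        """Record an Accept!(number, value) and apply whatever is now contiguous."""
        self.pending[number] = value
        while self.next_number in self.pending:
            self.applied.append(self.pending.pop(self.next_number))
            self.next_number += 1

    def missing(self):
        """Numbers below the highest buffered one that have not been seen."""
        if not self.pending:
            return []
        return [n for n in range(self.next_number, max(self.pending))
                if n not in self.pending]


# Replay the trace from the question, with N = 10 chosen arbitrarily.
N = 10
node = InOrderApplier(first_number=N)
for number, value in [(N, "Vn"), (N + 1, "Vm"), (N + 4, "Vo"), (N + 5, "Vp")]:
    node.on_accept(number, value)

print(node.applied)    # ['Vn', 'Vm']
print(node.missing())  # [12, 13]
```

The node stalls after applying `Vn` and `Vm`: `Vo` and `Vp` stay buffered until it learns what, if anything, happened at +2 and +3.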
What are the options or strategies for handling this, and what kind of overhead do they incur?
This question was inspired by "In Paxos, can an acceptor accept a different value after it has already accepted one?"