Next:Algorithm Up:Contents Previous:System Model and Assumptions

3 Algorithm Properties

Our algorithm is a form of state machine replication [17,28]: the service is modeled as a state machine that is replicated across different nodes in a distributed system. The algorithm can be used to implement any replicated service with a state and some operations. The operations are not restricted to simple reads and writes; they can perform arbitrary computations.

The service is implemented by a set of replicas R and each replica is identified using an integer in {0, ..., |R| - 1}. Each replica maintains a copy of the service state and implements the service operations. For simplicity, we assume |R| = 3f+1 where f is the maximum number of replicas that may be faulty. Service clients and replicas are non-faulty if they follow the algorithm and if no attacker can impersonate them (e.g., by forging their MACs).

Like all state machine replication techniques, we impose two requirements on replicas: they must start in the same state, and they must be deterministic (i.e., the execution of an operation in a given state and with a given set of arguments must always produce the same result). We can handle some common forms of non-determinism using the technique we described in [6].

Our algorithm ensures safety for an execution provided at most f replicas become faulty within a window of vulnerability of size Tv. Safety means that the replicated service satisfies linearizability [12,5]: it behaves like a centralized implementation that executes operations atomically one at a time. Our algorithm provides safety regardless of how many faulty clients are using the service (even if they collude with faulty replicas). We will discuss the window of vulnerability further in Section 4.7.

The algorithm also guarantees liveness: non-faulty clients eventually receive replies to their requests provided (1) at most f replicas become faulty within the window of vulnerability Tv; and (2) denial-of-service attacks do not last forever, i.e., there is some unknown point in the execution after which all messages are delivered (possibly after being retransmitted) within some constant time d, or all non-faulty clients have received replies to their requests. Here, d is a constant that depends on the timeout values used by the algorithm to refresh keys, and trigger view-changes and recoveries.

Miguel Castro and Barbara Liskov, "Proactive Recovery in a Byzantine-Fault-Tolerant System", in Proceedings of the Fourth Symposium on Operating Systems Design and Implementation, San Diego, USA, October 2000.