Proactive Recovery in a Byzantine-Fault-Tolerant System
Miguel Castro and Barbara Liskov
Laboratory for Computer Science,
Massachusetts Institute of Technology,
545 Technology Square, Cambridge, MA 02139
This paper describes an asynchronous state-machine replication system that
tolerates Byzantine faults, which can be caused by malicious attacks or
software errors. Our system is the first to recover Byzantine-faulty replicas
proactively, and it performs well because it uses symmetric rather than
public-key cryptography for authentication. The recovery mechanism allows
us to tolerate any number of faults over the lifetime of the system provided
fewer than 1/3 of the replicas become faulty within a window of
vulnerability that is small under normal conditions. The window may increase
under a denial-of-service attack, but we can detect and respond to such
attacks. The paper presents results of experiments showing that overall
performance is good and that even a small window of vulnerability has little
impact on service latency.
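
The "fewer than 1/3" condition above can be made concrete with a little
arithmetic. The following is a minimal sketch, not code from the system
itself: the function names max_tolerated_faults and within_bound are
hypothetical, chosen here for illustration. It assumes n replicas in total
and counts the replicas that become faulty within a single window of
vulnerability.

```python
def max_tolerated_faults(n: int) -> int:
    """Largest f with f < n/3, equivalently n >= 3f + 1.

    Hypothetical helper for illustration; n is the total number of replicas.
    """
    if n < 1:
        raise ValueError("need at least one replica")
    return (n - 1) // 3


def within_bound(n: int, faulty_in_window: int) -> bool:
    """True if fewer than 1/3 of the n replicas are faulty in one window.

    Integer arithmetic (3 * f < n) avoids floating-point rounding.
    """
    if faulty_in_window < 0:
        raise ValueError("fault count cannot be negative")
    return 3 * faulty_in_window < n


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(n, "replicas tolerate", max_tolerated_faults(n), "fault(s) per window")
```

Under this reading, four replicas tolerate one fault per window and seven
tolerate two. Because recovery returns faulty replicas to a correct state,
the bound applies to each window separately rather than to the whole
lifetime of the system.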
Published in the Proceedings of the Fourth Symposium on Operating Systems
Design and Implementation, San Diego, USA, October 2000.
This research was supported by DARPA under contract
F30602-98-1-0237 monitored by the Air Force Research Laboratory.