Annotation: Discusses design tradeoffs for giant-scale services that must degrade gracefully under load and failure and must support online evolution. Advocates automated upgrade systems and describes three approaches: fast reboot (upgrade every node at once), rolling upgrade (upgrade nodes in turn, round-robin), and big flip (partition the system, then upgrade one partition at a time). Insists that such systems need a safe, fast way to roll back to the old version, since new versions tend to be buggy. Notes that many systems use a staging area, where the new software is installed alongside the old before going live, which makes switchover in either direction easy.
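Illustration (not from the paper): the three approaches can be read as different ways of grouping nodes into upgrade waves, and the staging area as a pair of installed versions whose roles can be swapped. The Python sketch below is only one possible reading under those assumptions. The function names, the StagedNode class, the wave_size and partitions parameters, and the node and version labels are all hypothetical and do not appear in the paper.

```python
from dataclasses import dataclass
from typing import List


def _chunks(nodes: List[str], size: int) -> List[List[str]]:
    """Split nodes into consecutive groups of at most `size` nodes."""
    if not nodes:
        return []
    return [list(nodes[i:i + size]) for i in range(0, len(nodes), size)]


def fast_reboot(nodes: List[str]) -> List[List[str]]:
    """Fast reboot: every node is upgraded in a single wave."""
    return _chunks(nodes, len(nodes))


def rolling_upgrade(nodes: List[str], wave_size: int = 1) -> List[List[str]]:
    """Rolling upgrade: nodes are upgraded in turn, wave_size at a time."""
    return _chunks(nodes, wave_size)


def big_flip(nodes: List[str], partitions: int = 2) -> List[List[str]]:
    """Big flip: split the system into partitions, one wave per partition."""
    size = -(-len(nodes) // partitions)  # ceiling division
    return _chunks(nodes, size)


@dataclass
class StagedNode:
    """A node with the new version staged alongside the live one (assumed model)."""
    name: str
    live: str
    staged: str

    def switch(self) -> None:
        # Swapping live and staged works in either direction, so a second
        # call to switch() acts as the rollback to the old version.
        self.live, self.staged = self.staged, self.live
```

With four nodes n0 to n3, fast_reboot yields one wave of four, rolling_upgrade yields four waves of one, and big_flip yields two waves of two. Calling switch() on a StagedNode promotes the staged version, and calling it again restores the old one.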
BibTeX entry:
@article{brewer01lessons, author = {Eric A. Brewer}, title = {Lessons from Giant-Scale Services}, journal = {IEEE Internet Computing}, month = jul, year = {2001} }
Sameer Ajmani