Fault-tolerant services typically make assumptions about the type and maximum number of faults that they can tolerate while providing their correctness guarantees; when such a fault threshold is violated, correctness is lost. We revisit the notion of fault thresholds in the context of long-term archival storage. We observe that fault thresholds are inevitably violated in long-term services, making traditional fault tolerance inapplicable to them. In this work, we undertake a "reallocation of the fault-tolerance budget" of a long-term service. We split the service into pieces, each of which can tolerate a different number of faults without failing (and without causing the whole service to fail). Each piece belongs to one fault tier: a critical trusted tier, which must never fail; an untrusted tier, which can fail massively and often; or one of the tiers in between. By carefully engineering the split of a long-term service into pieces that must obey distinct fault thresholds, we can postpone its inevitable demise. We demonstrate this approach with Bonafide, a long-term key-value store that, unlike all similar systems proposed in the literature, maintains integrity in the face of Byzantine faults without requiring self-certified data. We describe the notion of tiered fault tolerance and the design, implementation, and experimental evaluation of Bonafide, and argue that our approach is a practical yet significant improvement over the state of the art for long-term services.
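As a rough illustration of per-tier fault thresholds, the toy model below treats a service as a set of tiers, each with its own fault budget, and considers the service correct only while every tier stays within its budget. This is a sketch under stated assumptions, not Bonafide's design: the names `FaultTier`, `tier_holds`, and `service_holds`, the three example tiers, and all replica counts and thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultTier:
    """One piece of the service with its own fault threshold (hypothetical model)."""
    name: str
    components: int   # assumed number of components in this tier
    max_faults: int   # faults the tier tolerates without failing


def tier_holds(tier: FaultTier, faults: int) -> bool:
    """A tier is within its threshold while observed faults do not exceed it."""
    return faults <= tier.max_faults


def service_holds(tiers: list[FaultTier], observed: dict[str, int]) -> bool:
    """The service stays correct only if every tier is within its own threshold.

    Tiers missing from `observed` are assumed to have zero faults.
    """
    return all(tier_holds(t, observed.get(t.name, 0)) for t in tiers)


# Illustrative numbers only; none of these values come from the paper.
TIERS = [
    FaultTier("trusted", components=1, max_faults=0),       # must never fail
    FaultTier("intermediate", components=4, max_faults=1),  # bounded faults
    FaultTier("untrusted", components=10, max_faults=10),   # may fail massively
]

if __name__ == "__main__":
    # Every untrusted component is faulty, yet the service still holds.
    print(service_holds(TIERS, {"untrusted": 10}))
    # A single fault in the trusted tier breaks the service.
    print(service_holds(TIERS, {"trusted": 1}))
```

In this model the untrusted tier can absorb the failure of all of its components, while one fault in the trusted tier is enough to lose correctness, which mirrors the idea of spending the fault-tolerance budget unevenly across pieces.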