Multiprocessor systems are widely used in many applications to enhance system reliability and performance. However, reliability does not come naturally with multiple processors. We develop a multi-invariant data structure approach to ensure efficient and robust access to shared data structures in multiprocessor systems. Essentially, the data structure is designed to satisfy two invariants: a strong invariant and a weak invariant. The system operates at peak performance when the strong invariant holds. It still operates correctly when only the weak invariant holds, though perhaps at a lower performance level. The design ensures that the weak invariant always holds in spite of fail-stop processor failures during execution. Because the system is only required to converge to a state satisfying the weak invariant, the overhead of incorporating fault tolerance can be reduced. In this paper, we present the basic idea of multi-invariant data structures. We also develop design rules that systematically convert fault-intolerant data abstractions into corresponding fault-tolerant versions. In this transformation, we augment the data structure and its access algorithms so that the system always converges to the weak invariant even in the presence of fail-stop processor failures. We also design methods for detecting integrity violations and for restoring the strong invariant. Two data structures, namely the binary search tree and the doubly linked list, are used to illustrate the concept of multi-invariant data structures.
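The strong/weak invariant idea can be illustrated with a small Python sketch of a doubly linked list. This is not the paper's design: the choice of invariants (weak: the forward chain from head to tail is intact; strong: every back pointer mirrors its forward pointer), the class and method names, and the `crash_before_backlinks` flag that simulates a fail-stop failure are all assumptions made for illustration, and the sketch is single-threaded.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None
        self.prev = None


class DList:
    """Doubly linked list with sentinel head and tail (hypothetical sketch)."""

    def __init__(self):
        self.head = Node(None)
        self.tail = Node(None)
        self.head.next = self.tail
        self.tail.prev = self.head

    def insert_after(self, node, key, crash_before_backlinks=False):
        new = Node(key)
        succ = node.next
        # Prepare the new node before it becomes reachable.
        new.next = succ
        new.prev = node
        # One pointer write publishes the node; the forward chain
        # (assumed weak invariant) holds both before and after it.
        node.next = new
        if crash_before_backlinks:
            return new  # simulated fail-stop: back pointer left stale
        succ.prev = new  # re-establishes the assumed strong invariant
        return new

    def weak_invariant(self):
        # Forward chain from head reaches tail without a cycle.
        seen = set()
        cur = self.head
        while cur is not None and id(cur) not in seen:
            if cur is self.tail:
                return True
            seen.add(id(cur))
            cur = cur.next
        return False

    def strong_invariant(self):
        # Weak invariant plus consistent back pointers.
        if not self.weak_invariant():
            return False
        cur = self.head
        while cur is not self.tail:
            if cur.next.prev is not cur:
                return False
            cur = cur.next
        return True

    def restore_strong(self):
        # Rebuild every back pointer from the forward chain.
        cur = self.head
        while cur is not self.tail:
            cur.next.prev = cur
            cur = cur.next

    def keys(self):
        out = []
        cur = self.head.next
        while cur is not self.tail:
            out.append(cur.key)
            cur = cur.next
        return out


lst = DList()
a = lst.insert_after(lst.head, 1)
b = lst.insert_after(a, 2, crash_before_backlinks=True)
weak_after_crash = lst.weak_invariant()
strong_after_crash = lst.strong_invariant()
lst.restore_strong()
strong_after_repair = lst.strong_invariant()
```

In this sketch, forward traversal via `keys()` remains correct after the simulated failure because it relies only on the forward chain; only operations that depend on back pointers would need the repair step first.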