The authors formally define what it means for a system to tolerate a class of faults. The definition consists of two conditions. The first is that if a fault occurs when the system state is within the set of legal states, the resulting state is within some larger set and, if faults continue to occur, the system state remains within that larger set (closure). The second is that if faults stop occurring, the system eventually reaches a state within the legal set (convergence). The applicability of the definition for specifying and verifying the fault-tolerance properties of a variety of digital and computer systems is demonstrated. Using the definition, the authors obtain a simple classification of fault-tolerant systems. Methods for the systematic design of such systems are discussed.
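The two conditions can be illustrated on a toy finite-state system. The following is a minimal sketch, not the authors' formalism: the names `is_closed`, `converges`, and `tolerates`, the counter program, and the bounded fault are all hypothetical and chosen only for illustration. It also assumes that the legal set and the larger set are both closed under the system's own actions, which the abstract does not state explicitly.

```python
from typing import Callable, Set

State = int
# A step maps a state to the set of states it may move to (assumed model).
Step = Callable[[State], Set[State]]


def is_closed(states: Set[State], step: Step) -> bool:
    """Closure: every step taken from inside `states` stays inside `states`."""
    return all(step(s) <= states for s in states)


def converges(span: Set[State], legal: Set[State], program: Step) -> bool:
    """Convergence: with no further faults, every program execution that
    starts in `span` eventually reaches `legal`."""
    good = set(legal)  # states already known to lead into `legal`
    changed = True
    while changed:
        changed = False
        for s in span - good:
            successors = program(s)
            # s is good if it can move and all its moves lead to good states.
            if successors and successors <= good:
                good.add(s)
                changed = True
    return span <= good


def tolerates(legal: Set[State], span: Set[State],
              program: Step, fault: Step) -> bool:
    """Check both conditions for one candidate larger set `span`."""
    return (legal <= span
            and is_closed(legal, program)   # assumption: legal set is closed
            and is_closed(span, program)    # assumption: program stays in span
            and is_closed(span, fault)      # faults never leave the larger set
            and converges(span, legal, program))


# Hypothetical example: a counter that the program drives down to 0,
# while a fault may bump it up by one, saturating at 3.
def program(x: State) -> Set[State]:
    return {max(x - 1, 0)}


def fault(x: State) -> Set[State]:
    return {min(x + 1, 3)}


LEGAL = {0}
SPAN = {0, 1, 2, 3}
```

Here `tolerates(LEGAL, SPAN, program, fault)` holds, whereas choosing the larger set as `{0, 1}` fails closure, because a fault in state 1 leads to state 2. A program that never changes the state fails convergence for the same sets.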