On verifying fault tolerance of distributed protocols

Authors:
Dana Fisman;Orna Kupferman;Yoad Lustig
Affiliations:
School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel and IBM Haifa Research Lab, Haifa, Israel;School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel;School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel
Venue:
TACAS'08/ETAPS'08 Proceedings of the Theory and practice of software, 14th international conference on Tools and algorithms for the construction and analysis of systems
Year:
2008

Abstract

Distributed systems are composed of processes connected in some network. Such systems may suffer from faults: processes may stop, be interrupted, or be maliciously attacked. Fault-tolerant protocols are designed to be resistant to faults. Proving that a protocol resists faults is a very challenging problem, as it combines the parameterized setting on which distributed systems are based with the need to consider a hostile environment that produces the faults. Considering all the possible fault scenarios for a protocol is very difficult, so reasoning about fault-tolerant protocols calls for formal methods. In this paper we describe a framework for verifying the fault tolerance of (synchronous or asynchronous) distributed protocols. In addition to the description of the protocol and the desired behavior, the user provides the fault type (e.g., fail-stop, Byzantine) and its distribution (e.g., at most half of the processes are faulty). Our framework is based on augmenting the description of the configurations of the system with a mask describing which processes are faulty. We focus on regular model checking and show how to compile the input of the model-checking problem into one that takes the faults and their distribution into account, and then perform regular model checking on the compiled input. We demonstrate the effectiveness of our framework and argue for its generality.
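
To make the idea of a fault mask concrete, here is a minimal Python sketch. It is not the construction from the paper: regular model checking represents sets of configurations symbolically as regular languages, whereas this sketch enumerates configurations explicitly for a fixed number of processes. The toy max-flooding protocol, the agreement property, and all function names are assumptions made for illustration only.

```python
from itertools import product

def at_most_half_faulty(mask):
    """Fault distribution (assumed example): at most half of the processes are faulty."""
    return 2 * sum(mask) <= len(mask)

def allowed_masks(n, distribution):
    """All fault masks of length n permitted by the distribution.

    A mask is a tuple of booleans; True marks a faulty process.
    """
    return [m for m in product((False, True), repeat=n) if distribution(m)]

def failstop_step(states, mask):
    """One synchronous round of a toy max-flooding protocol under fail-stop faults.

    A faulty (stopped) process neither sends its value nor updates its state.
    """
    live_values = [s for s, faulty in zip(states, mask) if not faulty]
    if not live_values:
        return states
    top = max(live_values)
    return tuple(s if faulty else top for s, faulty in zip(states, mask))

def live_agree(states, mask):
    """Desired behavior (assumed example): all non-faulty processes hold the same value."""
    live = {s for s, faulty in zip(states, mask) if not faulty}
    return len(live) <= 1

def check(n, values=(0, 1)):
    """Search every augmented configuration (states, mask) for a violation.

    Returns a counterexample pair, or None if the property holds after one round.
    """
    for mask in allowed_masks(n, at_most_half_faulty):
        for states in product(values, repeat=n):
            if not live_agree(failstop_step(states, mask), mask):
                return (states, mask)
    return None
```

Calling `check(4)` explores all four-process configurations paired with every mask that marks at most two processes as faulty. Swapping in a different distribution predicate or step function changes the fault model without touching the search loop, which mirrors, very loosely, the separation between protocol, fault type, and fault distribution described in the abstract.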