A theory for observational fault tolerance

Authors:
Adrian Francalanza; Matthew Hennessy
Affiliations:
University of Malta, Malta; University of Sussex, Brighton, England
Venue:
FOSSACS'06: Proceedings of the 9th European Joint Conference on Foundations of Software Science and Computation Structures
Year:
2006

Abstract

In general, faults cannot be prevented; instead, they must be tolerated to guarantee a certain degree of software dependability. We develop a theory of fault tolerance for a distributed pi-calculus in which locations act as units of failure and redundancy is distributed across independently failing locations. We give formal definitions of fault-tolerant programs in our calculus, based on the well-studied notion of contextual equivalence. We then develop bisimulation proof techniques for verifying fault tolerance properties of distributed programs and show that they are sound with respect to our definitions of fault tolerance.