Improving reliability of cooperative concurrent systems with exception flow analysis

Authors:
Fernando Castor Filho;Alexander Romanovsky;Cecília Mary F. Rubira
Affiliations:
Informatics Center, Federal University of Pernambuco, Av. Prof. Luís Freire s/n, 50740-540 Recife, PE, Brazil;School of Computing Science, Newcastle University, Newcastle NE1 7RU, UK;Institute of Computing, State University of Campinas, P.O. Box 6176, 13084-971 Campinas, SP, Brazil
Venue:
Journal of Systems and Software
Year:
2009

Abstract

Developers of fault-tolerant distributed systems need to guarantee that fault tolerance mechanisms they build are in themselves reliable. Otherwise, these mechanisms might in the end negatively affect overall system dependability, thus defeating the purpose of introducing fault tolerance into the system. To achieve the desired levels of reliability, mechanisms for detecting and handling errors should be developed rigorously or formally. We present an approach to modeling and verifying fault-tolerant distributed systems that use exception handling as the main fault tolerance mechanism. In the proposed approach, a formal model is employed to specify the structure of a system in terms of cooperating participants that handle exceptions in a coordinated manner, and coordinated atomic actions serve as representatives of mechanisms for exception handling in concurrent systems. We validate the approach through two case studies: (i) a system responsible for managing a production cell, and (ii) a medical control system. In both systems, the proposed approach has helped us to uncover design faults in the form of implicit assumptions and omissions in the original specifications.