Effective Fault Treatment for Improving the Dependability of COTS and Legacy-Based Applications

Authors:
Andrea Bondavalli;Silvano Chiaradonna;Domenico Cotroneo;Luigi Romano
Affiliations:
IEEE;-;-;-
Venue:
IEEE Transactions on Dependable and Secure Computing
Year:
2004

Abstract

This paper proposes a novel methodology and an architectural framework for handling multiple classes of faults in a COTS and Legacy-based application: hardware-induced software errors in the application, process and/or host crashes or hangs, and errors in the persistent system stable storage. The basic idea is to use an evidence-accruing fault tolerance manager that chooses and carries out one of multiple fault recovery strategies, depending on the perceived severity of the fault. The methodology and the framework have been applied to a case study consisting of a Legacy system that uses a COTS DBMS for persistent storage. A thorough performability analysis has also been conducted through the combined use of direct measurements and analytical modeling. Experimental results demonstrate that effective fault treatment, consisting of careful diagnosis and damage assessment, plays a key role in improving the dependability of COTS and Legacy-based applications.
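
The abstract does not give the internals of the evidence-accruing manager, so the following is only a minimal sketch of the general idea, not the authors' design. It assumes a simple threshold scheme: each observed error raises a per-component score, each error-free observation decays it, and the score reached selects a recovery action. The class name, the action names, the decay factor, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Recovery(Enum):
    # Hypothetical recovery actions, ordered from mildest to most severe.
    NONE = "no action"
    RESTART_PROCESS = "restart process"
    REBOOT_HOST = "reboot host"


@dataclass
class EvidenceManager:
    # Assumed tuning parameters; none of these values come from the paper.
    increment: float = 1.0   # evidence added per observed error
    decay: float = 0.5       # fraction of evidence kept after an error-free observation
    # Thresholds listed from most to least severe so the first match wins.
    thresholds: tuple = (
        (3.0, Recovery.REBOOT_HOST),
        (2.0, Recovery.RESTART_PROCESS),
    )
    scores: dict = field(default_factory=dict)

    def observe(self, component: str, error: bool) -> Recovery:
        """Update the evidence for a component and pick a recovery action."""
        score = self.scores.get(component, 0.0)
        if not error:
            # Error-free observation: old evidence fades, nothing to recover.
            self.scores[component] = score * self.decay
            return Recovery.NONE
        score += self.increment
        self.scores[component] = score
        for limit, action in self.thresholds:
            if score >= limit:
                return action
        return Recovery.NONE


if __name__ == "__main__":
    manager = EvidenceManager()
    # Three consecutive errors on the same component escalate the response.
    for _ in range(3):
        print(manager.observe("dbms", error=True).value)
```

In this sketch, isolated errors separated by error-free observations never accumulate enough evidence to trigger recovery, while repeated errors escalate from a process restart to a host reboot. A real manager would also need damage assessment and storage-level recovery, which are out of scope here.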