Piranha: A CORBA Tool For High Availability

Authors:
Silvano Maffeis
Venue:
Computer
Year:
1997

Citing 6

Impossibility of distributed consensus with one faulty process
Journal of the ACM (JACM)

Horus: a flexible group communication system
Communications of the ACM

Fault-tolerant broadcasts and related problems
Distributed systems (2nd Ed.)

Reliable Distributed Computing with the ISIS Toolkit
Reliable Distributed Computing with the ISIS Toolkit

The object group design pattern
COOTS'96 Proceedings of the 2nd conference on USENIX Conference on Object-Oriented Technologies (COOTS) - Volume 2

Constructing reliable distributed communication systems with CORBA
IEEE Communications Magazine

Cited 11

Chameleon: A Software Infrastructure for Adaptive Fault Tolerance
IEEE Transactions on Parallel and Distributed Systems

A Framework for Evaluating Distributed Object Models and its Application to Web Engineering
Annals of Software Engineering

Hierarchical Error Detection in a Software Implemented Fault Tolerance (SIFT) Environment
IEEE Transactions on Knowledge and Data Engineering

AQuA: An Adaptive Architecture that Provides Dependable Distributed Objects
IEEE Transactions on Computers

A Scalable Fault-Tolerant Network Management System Built Using Distributed Object Technology
EDOC '97 Proceedings of the 1st International Conference on Enterprise Distributed Object Computing

CORBA Based Runtime Support for Load Distribution and Fault Tolerance
IPDPS '00 Proceedings of the 15 IPDPS 2000 Workshops on Parallel and Distributed Processing

A Tailorable Distributed Programming Environment
Ada-Europe '02 Proceedings of the 7th Ada-Europe International Conference on Reliable Software Technologies

CCS Resource Management in Networked HPC Systems
HCW '98 Proceedings of the Seventh Heterogeneous Computing Workshop

A graphical environment for GLADE
Ada-Europe'03 Proceedings of the 8th Ada-Europe international conference on Reliable software technologies

FTRMI: fault-tolerant transparent RMI
Proceedings of the 27th Annual ACM Symposium on Applied Computing

Transparently increasing RMI fault tolerance
ACM SIGAPP Applied Computing Review

Quantified Score

Hi-index	4.11

Abstract

Distributed systems, such as satellite surveillance systems and real-time feeds for financial data, must be heterogeneous, interoperable, extensible, and available. Availability is a kind of fault tolerance: the system can continue to provide important services despite partial failure of its computers or software objects. The Object Management Group's Common Object Request Broker Architecture (CORBA) addresses only the first three characteristics. With respect to heterogeneity, for example, programmers can hide details of the underlying hardware and system software behind a portable interface, using CORBA's Interface Definition Language (IDL). IDL allows CORBA objects to invoke operations on each other even when they are implemented in different languages and run on incompatible operating systems. Wrapper objects and Object Request Broker (ORB) gateways enable interoperability by letting programmers interface new technology to legacy information systems. Finally, CORBA supports the development of highly modular applications, so programmers can more easily achieve extensibility, as well as better maintainability. To help address availability and reliability, the author developed an experimental CORBA-based restart service and monitor called Piranha (not related to the Yale University system). Piranha acts as a network monitor that reports failures through a graphical user interface. It also acts as a manager: it automatically restarts failed CORBA objects, replicates stateful objects (objects that maintain an internal set of values) on the fly, migrates objects from one host to another, and enforces predefined replication degrees (numbers of copies) on groups of objects. The article first examines the ways in which a CORBA ORB should support availability. It then explains how Piranha affords availability.
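
The idea of enforcing a replication degree on an object group can be illustrated with a minimal sketch. This is not Piranha's code or API: the names `Replica`, `ObjectGroup`, and `enforce_degree`, the host labels, and the in-memory model are all hypothetical, and a real manager would detect failures and start objects through an ORB rather than by flipping a flag.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    host: str           # hypothetical host label
    alive: bool = True  # set to False to simulate a failure


@dataclass
class ObjectGroup:
    name: str
    degree: int  # desired number of live replicas
    replicas: list = field(default_factory=list)


def enforce_degree(group, hosts):
    """Drop failed replicas, then start new ones until `degree` live copies exist.

    `hosts` lists the hosts currently available. Returns the hosts on which
    new replicas were started.
    """
    group.replicas = [r for r in group.replicas if r.alive]
    used = {r.host for r in group.replicas}
    started = []
    for host in hosts:
        if len(group.replicas) >= group.degree:
            break
        if host not in used:  # at most one replica of the group per host
            group.replicas.append(Replica(host))
            used.add(host)
            started.append(host)
    return started


# Three replicas; the one on host "b" fails and host "b" is no longer available.
group = ObjectGroup("quote-feed", degree=3,
                    replicas=[Replica("a"), Replica("b"), Replica("c")])
group.replicas[1].alive = False
started = enforce_degree(group, ["a", "c", "d"])
```

In this sketch the failed replica is replaced by a fresh one on the spare host, which brings the group back to its configured degree. State transfer for stateful objects is deliberately left out.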