Managing Fault Tolerance Transparently Using CORBA Services

  • Authors:
  • René Meier;Paddy Nixon

  • Affiliations:
  • -;-

  • Venue:
  • Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
  • Year:
  • 1999

Quantified Score

Hi-index 0.00

Visualization

Abstract

Fault tolerance problems arise in large-scale distributed systems because application components may eventually fail due to hardware problems, operator mistakes or design faults. Fault tolerance mechanisms must be employed to reduce the susceptibility of a given system to failure. In this paper, we describe the design of an architecture to overcome potential application component failures, using CORBA, a distributed object middleware specified by the OMG. Of primary importance to this architecture is OMG's CORBA Object Trading Service as the mechanism to advertise and manage service offers for fault tolerant application components. This mechanism enables clients transparently to detect a failed connection to a service object, to discover a similar backup service object and to re-connect to it. This improves overall system stability and enables scalability.