Introducing Fault-Tolerant Group Membership into the Collaborative Computing Transport Layer

  • Authors:
  • Roger J. Loader;James S. Pascoe;Vaidy S. Sunderam

  • Venue:
  • ICCS '01 Proceedings of the International Conference on Computational Sciences-Part I
  • Year:
  • 2001

Abstract

In this paper we introduce the novel election-based fault tolerance mechanisms recently incorporated into the Collaborative Computing Transport Layer (CCTL). CCTL offers the atomic reliable multicast facilities used in the Collaborative Computing Framework (CCF). Our approach utilizes a reliable IP multicast primitive to implement two electoral algorithms that not only form consensus, but also efficiently deliver a compact matrix-based view of the network. This matrix can subsequently be analyzed to identify specific network failures (e.g., partitioning). The underlying premise of the approach is that, by basing fault tolerance on a reliable multicast primitive, we eliminate the need for specific keep-alive packets such as heartbeats.
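
To illustrate how a matrix-based view might be analyzed for partitioning, the following is a minimal sketch and not the CCTL implementation. It assumes, purely for illustration, that the view is a square boolean matrix in which view[i][j] is True when member i received member j's message during an election round; the abstract does not specify the actual matrix layout. The function name find_partitions and the sample data are hypothetical. Under that assumption, partitions are simply the connected components of the mutual-reachability graph.

```python
from typing import List


def find_partitions(view: List[List[bool]]) -> List[List[int]]:
    """Group member indices into partitions (connected components).

    Assumption: view[i][j] is True if member i heard from member j.
    A link counts as usable only when both directions were observed.
    """
    n = len(view)
    seen = [False] * n
    partitions: List[List[int]] = []
    for start in range(n):
        if seen[start]:
            continue
        # Depth-first search from an unvisited member.
        seen[start] = True
        stack = [start]
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if not seen[j] and view[i][j] and view[j][i]:
                    seen[j] = True
                    stack.append(j)
        partitions.append(sorted(component))
    return partitions


# Hypothetical four-member group split into two halves.
view = [
    [True, True, False, False],
    [True, True, False, False],
    [False, False, True, True],
    [False, False, True, True],
]
print(find_partitions(view))  # prints [[0, 1], [2, 3]]
```

A single component covering every member would indicate no partition; two or more components suggest the group has split, and a singleton component may point to an isolated or failed member.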