Reliable Distributed Sorting Through the Application-Oriented Fault Tolerance Paradigm

  • Authors:
  • B. M. McMillin; L. M. Ni

  • Venue:
  • IEEE Transactions on Parallel and Distributed Systems
  • Year:
  • 1992

Abstract

A fault-tolerant parallel sorting algorithm, developed using the application-oriented fault tolerance paradigm, is presented. The algorithm tolerates one processor or link failure in an n-cube. Adding reliability to the sorting algorithm incurs a performance penalty. Asymptotically, the fault-tolerant algorithm is less costly than host sorting. Experimentally, it is shown that fault-tolerant sorting quickly becomes more efficient than host sorting when the bitonic sort/merge is considered. The main contribution is the demonstration that the application-oriented fault tolerance paradigm is applicable to problems of a noniterative-convergent nature.
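
As a rough illustration of the general idea only (not the algorithm or the assertions from the paper), the sketch below simulates a bitonic sort on the 2^n nodes of an n-cube, one key per node, and checks an executable assertion after each merge stage. The assertion used here (every block is sorted in the expected direction and no key has been lost or altered) is an assumption chosen for simplicity. The names `bitonic_sort_with_assertions`, `check_stage`, `FaultDetected`, and the `fault` parameter are hypothetical.

```python
from collections import Counter


class FaultDetected(Exception):
    """Raised when a stage-level assertion fails (hypothetical name)."""


def check_stage(keys, stage, original):
    """Illustrative assertion: after merge stage `stage`, every block of
    2**(stage+1) nodes is sorted (even blocks ascending, odd blocks
    descending) and the multiset of keys is unchanged."""
    if Counter(keys) != original:
        raise FaultDetected(f"stage {stage}: keys lost or altered")
    size = 1 << (stage + 1)
    for start in range(0, len(keys), size):
        block = keys[start:start + size]
        ascending = (start // size) % 2 == 0
        if block != sorted(block, reverse=not ascending):
            raise FaultDetected(f"stage {stage}: block at node {start} out of order")


def bitonic_sort_with_assertions(values, fault=None):
    """Simulate bitonic sort on 2**n nodes and assert after each stage.

    `fault` is an optional (stage, step, node, bad_value) tuple that
    overwrites one node's key, standing in for a faulty processor.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("number of nodes must be a power of two")
    n = p.bit_length() - 1
    keys = list(values)
    original = Counter(keys)
    for i in range(n):                 # stage i builds sorted runs of 2**(i+1)
        for j in range(i, -1, -1):     # compare-exchange across cube dimension j
            for node in range(p):
                partner = node ^ (1 << j)
                if node < partner:
                    ascending = (node >> (i + 1)) & 1 == 0
                    lo, hi = sorted((keys[node], keys[partner]))
                    keys[node], keys[partner] = (lo, hi) if ascending else (hi, lo)
            if fault is not None and fault[:2] == (i, j):
                keys[fault[2]] = fault[3]   # inject the simulated fault
        check_stage(keys, i, original)
    return keys


if __name__ == "__main__":
    print(bitonic_sort_with_assertions([7, 3, 6, 1, 8, 2, 5, 4]))
    try:
        bitonic_sort_with_assertions([7, 3, 6, 1, 8, 2, 5, 4], fault=(1, 0, 2, 99))
    except FaultDetected as err:
        print("detected:", err)
```

In this simulation a single global check stands in for what, on a real n-cube, would have to be assertions evaluated locally by each node from the messages it receives; the sketch only shows how a check on intermediate results can be attached to a noniterative algorithm.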