Fault-tolerant routing in MIN-based supercomputers

  • Authors:
  • Suresh Chalasani;Anujan Varma;C. S. Raghavendra

  • Affiliations:
  • Department of EE-Systems, University of Southern California, Los Angeles, CA;IBM Research Division, T.J. Watson Research Center, Yorktown Heights, NY;Department of EE-Systems, University of Southern California, Los Angeles, CA

  • Venue:
  • Proceedings of the 1990 ACM/IEEE conference on Supercomputing
  • Year:
  • 1990

Abstract

In this paper we study methods for routing data in supercomputers that use multistage interconnection networks (MINs), in the presence of faulty components in the network. These methods are applicable to existing multiprocessors like IBM GF11 and RP3. These methods are based on the concept of dynamic full-access (DFA), which refers to the ability of the network to route data from any processor in the system to any other processor in a finite number of passes through the network. We introduce a graph model called the DFA graph of a MIN and show how it can be used to determine the DFA capability of the MIN under a given set of network faults. When the faults in the network satisfy certain special properties, we present algorithms for routing any arbitrary permutation in a faulty Beneš network, and any Omega permutation in a faulty Omega network. These algorithms are simple and operate in a distributed fashion. These techniques allow a supercomputer to efficiently realize permutations of data needed in a parallel computing environment despite the presence of faults in the network.
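
The DFA idea in the abstract can be illustrated with a small Python sketch. This is not the paper's DFA graph construction, which the abstract does not spell out. It rests on two simplifying assumptions: a faulty 2x2 switch in an Omega network is entirely unusable, and a processor that receives data on one pass can re-inject it on the next. The function names (omega_path, one_pass_graph, has_dfa) are hypothetical. Under those assumptions, the network has DFA exactly when the directed graph of fault-free one-pass routes is strongly connected.

```python
from collections import deque


def omega_path(src, dst, n):
    """Switches (stage, index) on the unique src->dst path in an N = 2**n Omega network."""
    mask = (1 << n) - 1
    pos = src
    path = []
    for stage in range(n):
        # Perfect shuffle: rotate the n-bit position left by one.
        pos = ((pos << 1) | (pos >> (n - 1))) & mask
        path.append((stage, pos >> 1))
        # Destination-tag routing: the switch output is chosen by the
        # next destination bit, most significant bit first.
        bit = (dst >> (n - 1 - stage)) & 1
        pos = (pos & ~1) | bit
    return path


def one_pass_graph(n, faulty):
    """Edge s -> d exists when the s->d path avoids every faulty switch (assumed fault model)."""
    size = 1 << n
    return {
        s: {d for d in range(size)
            if all(sw not in faulty for sw in omega_path(s, d, n))}
        for s in range(size)
    }


def reachable(graph, start):
    """Processors reachable from start in zero or more passes (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def has_dfa(n, faulty):
    """True when every processor can reach every other in a finite number of passes."""
    graph = one_pass_graph(n, faulty)
    return all(len(reachable(graph, s)) == len(graph) for s in graph)
```

With these assumptions, has_dfa(3, {(1, 0)}) is True even though processor 0 cannot reach processor 1 in a single pass: it can go through processor 5 in two passes. A faulty first-stage switch such as (0, 0) blocks every route out of processors 0 and 4, so has_dfa(3, {(0, 0)}) is False.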