Routing and broadcasting in faulty hypercube computers

Authors:
T. C. Lee; J. P. Hayes
Affiliations:
Advanced Computer Architecture Laboratory, Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, Michigan (both authors)
Venue:
C3P Proceedings of the third conference on Hypercube concurrent computers and applications: Architecture, software, computer systems, and general issues - Volume 1
Year:
1988



Abstract

This paper examines routing and broadcasting algorithms for hypercube computers subject to node failures. First, some simple message-passing algorithms are described that perform well with certain fault patterns but poorly with others. The concept of an unsafe node is then introduced to identify fault-free nodes that may cause communication difficulties in faulty hypercubes. It is shown that routing and broadcasting can be substantially simplified by using only “feasible” paths, which try to avoid unsafe nodes. Each active node is assumed to be supplied with the fault status of all neighboring nodes within a specified radius k. For k = 1, and provided that not all non-faulty nodes in the hypercube are unsafe, a computationally efficient routing algorithm is presented that routes a message along a path of length no greater than p+2, where p is the minimum feasible distance from the source to the destination. Under the same fault conditions, broadcasting can be achieved in only one more time unit than in the fault-free case.
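The abstract does not spell out when a fault-free node counts as unsafe, so the following is only an illustrative sketch under a stated assumption: a non-faulty node is treated as unsafe once at least two of its neighbors are faulty or already unsafe. That threshold, and the names `neighbors` and `unsafe_nodes`, are hypothetical choices made for this example and should be checked against the body of the paper. The sketch relies only on the standard fact that two nodes of an n-dimensional hypercube are adjacent exactly when their binary labels differ in one bit.

```python
from __future__ import annotations


def neighbors(node: int, n: int) -> list[int]:
    """Nodes adjacent to `node` in an n-dimensional hypercube (one bit flipped)."""
    return [node ^ (1 << d) for d in range(n)]


def unsafe_nodes(n: int, faulty: set[int], threshold: int = 2) -> set[int]:
    """Return the non-faulty nodes classified as unsafe.

    ASSUMPTION (not stated in the abstract): a non-faulty node is unsafe
    when at least `threshold` of its neighbors are faulty or unsafe.
    The classification is repeated until no new node is marked.
    """
    unsafe: set[int] = set()
    changed = True
    while changed:
        changed = False
        bad = faulty | unsafe  # snapshot taken at the start of each pass
        for node in range(1 << n):
            if node in bad:
                continue
            if sum(nb in bad for nb in neighbors(node, n)) >= threshold:
                unsafe.add(node)
                changed = True
    return unsafe


if __name__ == "__main__":
    # 3-cube with faulty nodes 000 and 011: nodes 001 and 010 each
    # have two faulty neighbors, so they are marked unsafe.
    print(sorted(unsafe_nodes(3, {0b000, 0b011})))  # [1, 2]
```

Under the same assumed rule, a 2-cube with faulty nodes 00 and 11 leaves both remaining nodes unsafe, which is the kind of situation the routing guarantee in the abstract excludes ("not all non-faulty nodes in the hypercube are unsafe").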