In this paper, we consider a class of modular multiprocessor architectures in which spares are added to each module to cover for faulty nodes within that module, thus forming a fault-tolerant basic block (FTBB). In contrast to reconfiguration techniques that preserve the physical adjacency between active nodes in the system, our goal is to preserve the logical adjacency between active nodes by means of a routing algorithm that delivers messages successfully to their destinations. We introduce two-phase routing strategies that route a message first to its destination FTBB and then to the destination node within that FTBB. Such a strategy may be applied to a variety of architectures, including binary hypercubes and three-dimensional tori. In the presence of f faults in hypercubes and tori, we show that the worst-case length of the message route is min{σ + f, (K + 1)σ} + c, where σ is the length of the shortest path in the absence of faults, K is the number of spare nodes in an FTBB, and c is a small constant. The average routing overhead is much lower than the worst-case overhead.
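As a rough illustration of the worst-case bound stated above, the following Python sketch evaluates it for a binary hypercube. It assumes that the shortest fault-free path length between two hypercube nodes is the Hamming distance between their binary labels, which is a standard property of hypercubes. The function names `hamming_distance` and `worst_case_route_length` are hypothetical, the parameter `sigma` stands for the shortest fault-free path length, and the abstract does not give a value for the constant c, so the value used here is purely illustrative.

```python
def hamming_distance(src: int, dst: int) -> int:
    """Shortest-path length between two nodes of a fault-free binary hypercube.

    Nodes are identified by integer labels; each differing bit is one hop.
    """
    return bin(src ^ dst).count("1")


def worst_case_route_length(sigma: int, f: int, K: int, c: int) -> int:
    """Evaluate the stated bound min{sigma + f, (K + 1) * sigma} + c.

    sigma: shortest path length in the absence of faults
    f:     number of faults in the system
    K:     number of spare nodes per FTBB
    c:     small constant (not specified in the abstract; an assumption here)
    """
    if min(sigma, f, K, c) < 0:
        raise ValueError("all parameters must be non-negative")
    return min(sigma + f, (K + 1) * sigma) + c


if __name__ == "__main__":
    # Nodes 0000 and 1011 differ in three bit positions.
    sigma = hamming_distance(0b0000, 0b1011)
    # c=2 is an illustrative placeholder, not a value from the paper.
    print(worst_case_route_length(sigma, f=2, K=1, c=2))
```

The two arguments of the minimum show how the bound behaves: with few faults the additive term σ + f is the tighter one, while with many faults the multiplicative term (K + 1)σ caps the route length independently of f.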