This paper presents several techniques for tolerating faults in d-dimensional mesh and hypercube architectures. The approach consists of adding spare processors and communication links so that the resulting architecture will contain a fault-free mesh or hypercube in the presence of faults. The authors optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor. For example, when the desired architecture is a d-dimensional mesh and k=1, they present a fault-tolerant architecture that has the same maximum degree as the desired architecture (namely, 2d) and has only one spare processor. They also present efficient layouts for fault-tolerant two- and three-dimensional meshes, and show how multiplexers and buses can be used to reduce the degree of fault-tolerant architectures. Finally, they give constructions for fault-tolerant tori, eight-connected meshes, and hexagonal meshes.
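The degree bound and the k=1 claim can be illustrated with a small sketch. It is not the authors' construction. It covers only the simplest case, d=1, where a linear array of n processors plus one spare can be wired as a cycle of n+1 nodes: removing any single node from a cycle leaves a linear array, and the maximum degree stays at 2d = 2. The function names (`mesh_max_degree`, `cycle_edges`, `survives_as_linear_array`, `one_fault_tolerant_linear_array`) are hypothetical and chosen for illustration.

```python
from itertools import product


def mesh_max_degree(dims):
    """Maximum node degree of a mesh with the given side lengths (no wraparound)."""
    best = 0
    for coord in product(*(range(n) for n in dims)):
        degree = 0
        for axis, n in enumerate(dims):
            if coord[axis] > 0:        # neighbor in the negative direction
                degree += 1
            if coord[axis] < n - 1:    # neighbor in the positive direction
                degree += 1
        best = max(best, degree)
    return best


def cycle_edges(m):
    """Edge set of a cycle on nodes 0..m-1 (assumed wiring for the d=1 sketch)."""
    return {frozenset((i, (i + 1) % m)) for i in range(m)}


def survives_as_linear_array(m, faulty):
    """Check that the m-1 healthy nodes still form a linear array after one fault."""
    edges = cycle_edges(m)
    # Walk the healthy nodes in cycle order, starting just after the faulty one.
    order = [(faulty + 1 + i) % m for i in range(m - 1)]
    return all(frozenset((a, b)) in edges for a, b in zip(order, order[1:]))


def one_fault_tolerant_linear_array(n):
    """True if a cycle of n + 1 nodes tolerates any single node fault (k = 1)."""
    m = n + 1  # n working processors plus exactly one spare
    return all(survives_as_linear_array(m, f) for f in range(m))
```

For example, `mesh_max_degree((3, 3, 3))` evaluates to 6, matching the 2d figure for d=3, and `one_fault_tolerant_linear_array(5)` evaluates to `True`. Higher-dimensional meshes need the more involved constructions described in the paper; this sketch only makes the one-dimensional case concrete.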