We present a method to enhance wormhole routing algorithms for deadlock-free fault-tolerant routing in tori. We consider arbitrarily located faulty blocks and assume only local knowledge of faults. Messages are routed via shortest paths when there are no faults, and this constraint is only slightly relaxed to facilitate routing in the presence of faults. The key concept we use is that, for each fault region, a fault ring consisting of fault-free nodes and physical channels can be formed around it. These fault rings can be used to route messages around fault regions. We prove that at most four additional virtual channels are sufficient to make any fully-adaptive algorithm tolerant to multiple faulty blocks in torus networks. As an example of this technique, we present simulation results for a fully-adaptive algorithm and show that good performance can be obtained with as many as 10% of the links faulty.
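To make the fault-ring idea concrete, the following is a minimal sketch that lists the fault-free nodes surrounding one rectangular faulty block in a two-dimensional k x k torus. It is an illustration under stated assumptions, not the routing algorithm of the paper: the function name `fault_ring`, the block description (`x0`, `y0`, `w`, `h`), and the restriction to a single block in two dimensions are all hypothetical choices made for the example. Virtual channel assignment and the handling of overlapping rings are not modeled.

```python
def fault_ring(k, x0, y0, w, h):
    """Nodes of the ring around a faulty block in a k x k torus.

    Assumptions (hypothetical, for illustration only): the faulty block
    is a w x h rectangle whose lowest-index corner is (x0, y0), and
    coordinates wrap around modulo k. Nodes are returned in the order
    they are visited when walking once around the block.
    """
    if w < 1 or h < 1:
        raise ValueError("block must contain at least one node")
    if w + 2 > k or h + 2 > k:
        # The ring needs one fault-free row/column on each side.
        raise ValueError("block too large to be enclosed by a ring")

    ring = []
    # Edge just below the block, including both corners.
    for dx in range(-1, w + 1):
        ring.append(((x0 + dx) % k, (y0 - 1) % k))
    # Edge just past the block in the +x direction, up to the far corner.
    for dy in range(0, h + 1):
        ring.append(((x0 + w) % k, (y0 + dy) % k))
    # Edge just above the block, walking back in the -x direction.
    for dx in range(w - 1, -2, -1):
        ring.append(((x0 + dx) % k, (y0 + h) % k))
    # Edge just before the block in the -x direction, closing the ring.
    for dy in range(h - 1, -1, -1):
        ring.append(((x0 - 1) % k, (y0 + dy) % k))
    return ring


if __name__ == "__main__":
    # A 2 x 1 faulty block at (7, 3) in an 8 x 8 torus; it wraps in x.
    for node in fault_ring(8, 7, 3, 2, 1):
        print(node)
```

The ring contains 2w + 2h + 4 nodes, and each node needs only the position of the adjacent block to determine its ring neighbors, which is in keeping with the local-knowledge assumption stated above.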