Current multiprocessors such as the Cray T3D support interprocessor communication using partitioned dimension-order routers (PDRs). In a PDR implementation, the routing logic and switching hardware are partitioned into multiple modules, with each module suitable for implementation as a chip. This paper proposes a method to incorporate fault tolerance into such routers with simple changes to the router structure and logic. Previously known fault-tolerant routing methods assume centralized crossbar-based routers and are not applicable to multiprocessors with PDRs. The proposed technique works for the convex fault model, using only local knowledge of faults. Using the proposed techniques and as few as four virtual channels per physical channel, torus networks with PDRs can handle faults without compromising deadlock- and livelock-freedom. Simulations for 2-dimensional torus and mesh networks show that the resulting fault-tolerant PDRs perform similarly to crossbar-based routers.
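
For readers unfamiliar with the baseline that PDRs implement, the following is a minimal sketch of plain dimension-order routing on a torus: the message corrects its offset fully in one dimension before moving to the next. It is an illustration only and does not model the fault-tolerant scheme, the virtual channels, or the module partitioning described above. The function name `dimension_order_route` and the tie-breaking rule (prefer the positive direction when both ways around a ring are equally short) are assumptions made for this example.

```python
def dimension_order_route(src, dst, dims):
    """Return the nodes visited from src to dst on a torus.

    src, dst: coordinate tuples; dims: ring size in each dimension.
    Hypothetical helper for illustration; no faults are modeled.
    """
    path = [tuple(src)]
    cur = list(src)
    for d, k in enumerate(dims):
        # Hops needed in the positive direction around the ring of size k.
        fwd = (dst[d] - cur[d]) % k
        # Take the shorter way around; ties go to the positive direction (assumption).
        step = 1 if fwd <= k - fwd else -1
        hops = fwd if step == 1 else k - fwd
        for _ in range(hops):
            cur[d] = (cur[d] + step) % k  # wraparound link of the torus
            path.append(tuple(cur))
    return path


if __name__ == "__main__":
    # 4x4 torus: two hops in dimension 0, then one wraparound hop in dimension 1.
    print(dimension_order_route((0, 0), (2, 3), (4, 4)))
```

Because each dimension is finished before the next begins, the per-dimension work in the outer loop is loosely analogous to what a single module of a partitioned router would handle.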