A Family of Fault-Tolerant Routing Protocols for Direct Multiprocessor Networks
IEEE Transactions on Parallel and Distributed Systems
Adaptive Fault-Tolerant Deadlock-Free Routing in Meshes and Hypercubes
IEEE Transactions on Computers
Dynamically Configurable Message Flow Control for Fault-Tolerant Routing
IEEE Transactions on Parallel and Distributed Systems
Threshold-Based Mechanisms to Discriminate Transient from Intermittent Faults
IEEE Transactions on Computers
A Fault-Tolerant Routing Scheme for Meshes with Nonconvex Faults
IEEE Transactions on Parallel and Distributed Systems
Probability vectors: a new fault-tolerant routing algorithm for k-ary n-cubes
Proceedings of the 2002 ACM symposium on Applied computing
Journal of Parallel and Distributed Computing
A new routing mechanism for networks with irregular topology
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Fault-Tolerant Wormhole Routing Algorithms for Mesh Networks
IEEE Transactions on Computers
Communication in Multicomputers with Nonconvex Faults
IEEE Transactions on Computers
Deadlock-Free Adaptive Routing in Multicomputer Networks Using Virtual Channels
IEEE Transactions on Parallel and Distributed Systems
A fault-tolerant wormhole routing scheme for torus networks with nonconvex faults
Information Processing Letters
A New Approach to Fault-Tolerant Wormhole Routing for Mesh-Connected Parallel Computers
IEEE Transactions on Computers
Principles and Practices of Interconnection Networks
Principles and Practices of Interconnection Networks
Immunet: A Cheap and Robust Fault-Tolerant Packet Routing Mechanism
Proceedings of the 31st annual international symposium on Computer architecture
A Probabilistic Approach to Fault-Tolerant Routing Algorithm on Mesh Networks
ICPADS '04 Proceedings of the Parallel and Distributed Systems, Tenth International Conference
An efficient reconfiguration scheme for fault-tolerant meshes
Information Sciences—Informatics and Computer Science: An International Journal
A Routing Methodology for Achieving Fault Tolerance in Direct Networks
IEEE Transactions on Computers
Overview of the Blue Gene/L system architecture
IBM Journal of Research and Development
Blue Gene/L torus interconnection network
IBM Journal of Research and Development
EUROMICRO-PDP'02 Proceedings of the 10th Euromicro conference on Parallel, distributed and network-based processing
Understanding the interconnection network of SpiNNaker
Proceedings of the 23rd international conference on Supercomputing
A Multipath Fault-Tolerant Routing Method for High-Speed Interconnection Networks
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
Symbiotic routing in future data centers
Proceedings of the ACM SIGCOMM 2010 conference
CamCubeOS: a key-based network stack for 3D torus cluster topologies
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
This work presents Immucube, a scalable and efficient mechanism for improving the dependability of interconnection networks in parallel and distributed computers. Immucube achieves better flexibility and scalability than previous fault-tolerant mechanisms for k-ary n-cubes. The proposal inherits from Immunet several advantages over earlier fault-tolerant routing algorithms: 1) it tolerates any temporal and spatial combination of faults, 2) it reconfigures automatically and transparently to the application after any fault, and 3) it incurs negligible overhead in the absence of faults. Immucube adds several important features: 4) it degrades performance gracefully, even in very large interconnection networks, 5) it transparently resumes using resources after transient faults or after partial repair of faulty resources, 6) it handles intermittent faults, and 7) it dynamically recovers the original network performance once all failed components have been repaired.