Planar-adaptive routing: low-cost adaptive networks for multiprocessors
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The turn model for adaptive routing
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Adaptive Fault-Tolerant Deadlock-Free Routing in Meshes and Hypercubes
IEEE Transactions on Computers
A Theory of Fault-Tolerant Routing in Wormhole Networks
IEEE Transactions on Parallel and Distributed Systems
Fault-tolerant wormhole routing in mesh with overlapped solid fault regions
Parallel Computing
Reliable computer systems (3rd ed.): design and evaluation
Reliable computer systems (3rd ed.): design and evaluation
Fault-Tolerant Wormhole Routing Algorithms for Mesh Networks
IEEE Transactions on Computers
Fault-Tolerant Wormhole Routing in Meshes without Virtual Channels
IEEE Transactions on Parallel and Distributed Systems
Adaptive Fault-tolerant Wormhole Routing in 2D Meshes
IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
A Fault-Tolerant and Deadlock-Free Routing Protocol in 2D Meshes Based on Odd-Even Turn Model
IEEE Transactions on Computers
A fault-tolerant wormhole routing scheme for torus networks with nonconvex faults
Information Processing Letters
A New Approach to Fault-Tolerant Wormhole Routing for Mesh-Connected Parallel Computers
IEEE Transactions on Computers
Immunet: A Cheap and Robust Fault-Tolerant Packet Routing Mechanism
Proceedings of the 31st annual international symposium on Computer architecture
Multi-phase minimal fault-tolerant wormhole routing in meshes
Parallel Computing
An Efficient Fault-Tolerant Routing Methodology for Meshes and Tori
IEEE Computer Architecture Letters
A survey of research and practices of Network-on-chip
ACM Computing Surveys (CSUR)
Reliability modeling and management in dynamic microprocessor-based systems
Proceedings of the 43rd annual Design Automation Conference
ElastIC: An Adaptive Self-Healing Architecture for Unpredictable Silicon
IEEE Design & Test
Vicis: a reliable network for unreliable silicon
Proceedings of the 46th Annual Design Automation Conference
OE+IOE: a novel turn model based fault tolerant routing scheme for networks-on-chip
CODES/ISSS '10 Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
Proceedings of the Third International Workshop on Network on Chip Architectures
On the design and analysis of fault tolerant NoC architecture using spare routers
Proceedings of the 16th Asia and South Pacific Design Automation Conference
A resilient on-chip router design through data path salvaging
Proceedings of the 16th Asia and South Pacific Design Automation Conference
Proceedings of the 16th Asia and South Pacific Design Automation Conference
A thermal-aware application specific routing algorithm for network-on-chip design
Proceedings of the 16th Asia and South Pacific Design Automation Conference
Exploiting inherent information redundancy to manage transient errors in NoC routing arbitration
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
A distributed and topology-agnostic approach for on-line NoC testing
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Energy and reliability oriented mapping for regular Networks-on-Chip
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
A fault-tolerant NoC scheme using bidirectional channel
Proceedings of the 48th Design Automation Conference
Adaptive inter-layer message routing in 3D networks-on-chip
Microprocessors & Microsystems
A novel graceful degradable routing algorithm for 3D on-chip networks
Proceedings of the 2012 Interconnection Network Architecture: On-Chip, Multi-Chip Workshop
Application-aware deadlock-free oblivious routing based on extended turn-model
Proceedings of the International Conference on Computer-Aided Design
Fault tolerance analysis of mesh networks with uniform versus nonuniform node failure probability
Information Processing Letters
A scalable and fault-tolerant network routing scheme for many-core and multi-chip systems
Journal of Parallel and Distributed Computing
NoCAlert: An On-Line and Real-Time Fault Detection Mechanism for Network-on-Chip Architectures
MICRO-45 Proceedings of the 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture
Topology-agnostic fault-tolerant NoC routing method
Proceedings of the Conference on Design, Automation and Test in Europe
Addressing network-on-chip router transient errors with inherent information redundancy
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on Wireless Health Systems, On-Chip and Off-Chip Network Architectures
A complete self-testing and self-configuring NoC infrastructure for cost-effective MPSoCs
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on Wireless Health Systems, On-Chip and Off-Chip Network Architectures
Enabling power efficiency through dynamic rerouting on-chip
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on Wireless Health Systems, On-Chip and Off-Chip Network Architectures
AFRA: a low cost high performance reliable routing for 3D mesh NoCs
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
A unified link-layer fault-tolerant architecture for network-based many-core embedded systems
Journal of Systems Architecture: the EUROMICRO Journal
A fault tolerant NoC architecture using quad-spare mesh topology and dynamic reconfiguration
Journal of Systems Architecture: the EUROMICRO Journal
Journal of Electronic Testing: Theory and Applications
Methods for fault tolerance in networks-on-chip
ACM Computing Surveys (CSUR)
Use it or lose it: wear-out and lifetime in future chip multiprocessors
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
Supporting faulty banks in NUCA by NoC assisted remapping mechanisms
The Journal of Supercomputing
Hi-index | 0.00 |
Current trends in technology scaling foreshadow worsening transistor reliability as well as greater numbers of transistors in each system. The combination of these factors will soon make long-term product reliability extremely difficult in complex modern systems such as systems on a chip (SoC) and chip multiprocessor (CMP) designs, where even a single device failure can cause fatal system errors. Resiliency to device failure will be a necessary condition at future technology nodes. In this work, we present a network-on-chip (NoC) routing algorithm to boost the robustness in interconnect networks, by reconfiguring them to avoid faulty components while maintaining connectivity and correct operation. This distributed algorithm can be implemented in hardware with less than 300 gates per network router. Experimental results over a broad range of 2D-mesh and 2D-torus networks demonstrate 99.99% reliability on average when 10% of the interconnect links have failed.