The choice of routing algorithm plays a vital role in the performance of on-chip interconnection networks. Adaptive routing is appealing because it offers better latency and throughput than oblivious routing, especially under nonuniform and bursty traffic. The performance of an adaptive routing algorithm is determined by its ability to accurately estimate congestion in the network. In this regard, maintaining global congestion state using a separate monitoring network offers better congestion visibility into distant parts of the network compared to solutions relying only on local congestion. However, the main challenge in designing such routing schemes is to keep the logic and bandwidth overhead as low as possible to fit into the tight power, area, and delay budgets of on-chip routers. In this article, we propose a minimal destination-based adaptive routing strategy (DAR), where every node estimates the delay to every other node in the network, and routing decisions are based on these per-destination delay estimates. DAR outperforms Regional Congestion Awareness (RCA), the best previously known adaptive routing algorithm that uses nonlocal congestion state. The performance improvement is brought about by maintaining fine-grained per-destination delay estimates in DAR that are more accurate than regional congestion metrics measured in RCA. The increased accuracy is a consequence of the fact that the per-destination delay estimates are not corrupted by congestion on links outside the admissible routing paths to the destination. A scalable version of DAR, referred to as SDAR, is also proposed for minimizing the overheads associated with DAR in large network topologies. We show that DAR outperforms local adaptive routing by up to 79% and RCA by up to 58% in terms of latency on SPLASH-2 benchmarks. 
DAR and SDAR also outperform existing adaptive and oblivious routing algorithms in latency and throughput under synthetic traffic patterns on 8×8 and 16×16 mesh topologies, respectively.
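
The core idea of routing on per-destination delay estimates can be illustrated with a minimal sketch. It is not the authors' implementation: the port names, the `delay_est` table layout, and the helper functions `minimal_ports` and `choose_port` are hypothetical, and the abstract does not describe how the estimates are measured or propagated. The sketch assumes a 2D mesh in which each router already holds, for every destination and admissible output port, an estimate of the delay to that destination through that port.

```python
from typing import Dict, List, Tuple

Node = Tuple[int, int]  # (x, y) coordinates of a router in a 2D mesh


def minimal_ports(cur: Node, dst: Node) -> List[str]:
    """Return the output ports that lie on some minimal path from cur to dst."""
    ports = []
    if dst[0] > cur[0]:
        ports.append("E")
    elif dst[0] < cur[0]:
        ports.append("W")
    if dst[1] > cur[1]:
        ports.append("N")
    elif dst[1] < cur[1]:
        ports.append("S")
    return ports


def choose_port(cur: Node, dst: Node,
                delay_est: Dict[Tuple[str, Node], float]) -> str:
    """Pick the admissible port with the lowest estimated delay to dst.

    delay_est maps (port, destination) to an estimated delay. This table
    layout is an assumption made for illustration only; missing entries
    are treated as zero delay.
    """
    candidates = minimal_ports(cur, dst)
    if not candidates:
        return "LOCAL"  # packet has arrived; eject to the local node
    return min(candidates, key=lambda p: delay_est.get((p, dst), 0.0))


# Hypothetical estimates held by the router at (1, 1).
estimates = {("E", (3, 3)): 12.0, ("N", (3, 3)): 7.5}
print(choose_port((1, 1), (3, 3), estimates))
```

Because the estimates are keyed by destination, congestion on links that cannot carry traffic toward `dst` never enters the comparison, which is the property the abstract credits for DAR's accuracy advantage over regional metrics.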