High performance communications in processor networks
ISCA '89 Proceedings of the 16th annual international symposium on Computer architecture
The turn model for adaptive routing
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
A New Theory of Deadlock-Free Adaptive Routing in Wormhole Networks
IEEE Transactions on Parallel and Distributed Systems
ROMM routing on mesh and torus networks
Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
An efficient, fully adaptive deadlock recovery scheme: DISHA
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
A Traffic-Balanced Adaptive Wormhole Routing Scheme for Two-Dimensional Meshes
IEEE Transactions on Computers
The MIT Alewife machine: architecture and performance
25 years of the international symposia on Computer architecture (selected papers)
Logical effort: designing fast CMOS circuits
Logical effort: designing fast CMOS circuits
Evaluation of crossbar architectures for deadlock recovery routers
Journal of Parallel and Distributed Computing
Route packets, not wires: on-chip inteconnection networks
Proceedings of the 38th annual Design Automation Conference
Parallel Computer Architecture: A Hardware/Software Approach
Parallel Computer Architecture: A Hardware/Software Approach
Locality-preserving randomized oblivious routing on torus networks
Proceedings of the fourteenth annual ACM symposium on Parallel algorithms and architectures
IEEE Transactions on Parallel and Distributed Systems
ICCD '92 Proceedings of the 1991 IEEE International Conference on Computer Design on VLSI in Computer & Processors
Throughput-centric routing algorithm design
Proceedings of the fifteenth annual ACM symposium on Parallel algorithms and architectures
Universal schemes for parallel communication
STOC '81 Proceedings of the thirteenth annual ACM symposium on Theory of computing
A large scale, homogeneous, fully distributed parallel machine, I
ISCA '77 Proceedings of the 4th annual symposium on Computer architecture
Dynamic Voltage Scaling with Links for Power Optimization of Interconnection Networks
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
Scalar Operand Networks: On-Chip Interconnect for ILP in Partitioned Architectures
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
GOAL: a load-balanced adaptive routing algorithm for torus networks
Proceedings of the 30th annual international symposium on Computer architecture
Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
Proceedings of the 30th annual international symposium on Computer architecture
A Delay Model and Speculative Architecture for Pipelined Routers
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Power-driven Design of Router Microarchitectures in On-chip Networks
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Low-Latency Virtual-Channel Routers for On-Chip Networks
Proceedings of the 31st annual international symposium on Computer architecture
Adaptive channel queue routing on k-ary n-cubes
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
Worst-case Traffic for Oblivious Routing Functions
IEEE Computer Architecture Letters
Increasing the throughput of an adaptive router in network-on-chip (NoC)
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Design tradeoffs for tiled CMP on-chip networks
Proceedings of the 20th annual international conference on Supercomputing
Express virtual channels: towards the ideal interconnection fabric
Proceedings of the 34th annual international symposium on Computer architecture
NoC-Based FPGA: Architecture and Routing
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
Statistical Approach to NoC Design
NOCS '08 Proceedings of the Second ACM/IEEE International Symposium on Networks-on-Chip
Application-aware deadlock-free oblivious routing
Proceedings of the 36th annual international symposium on Computer architecture
Achieving predictable performance through better memory controller placement in many-core CMPs
Proceedings of the 36th annual international symposium on Computer architecture
Static virtual channel allocation in oblivious routing
NOCS '09 Proceedings of the 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip
Path-based, randomized, oblivious, minimal routing
Proceedings of the 2nd International Workshop on Network on Chip Architectures
A high level power model for Network-on-Chip (NoC) router
Computers and Electrical Engineering
Outstanding research problems in NoC design: system, microarchitecture, and circuit perspectives
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Low-cost router microarchitecture for on-chip networks
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Design of a router for network-on-chip
International Journal of High Performance Systems Architecture
Destination-based adaptive routing on 2D mesh networks
Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
A novel 3D layer-multiplexed on-chip network
Proceedings of the 5th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
Pseudo-Circuit: Accelerating Communication for On-Chip Interconnection Networks
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Throughput-Effective On-Chip Networks for Manycore Accelerators
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Analysis of application-aware on-chip routing under traffic uncertainty
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
Deadlock-free fine-grained thread migration
NOCS '11 Proceedings of the Fifth ACM/IEEE International Symposium on Networks-on-Chip
An abacus turn model for time/space-efficient reconfigurable routing
Proceedings of the 38th annual international symposium on Computer architecture
Job scheduling for maximal throughput in autonomic computing systems
IWSOS'06/EuroNGI'06 Proceedings of the First international conference, and Proceedings of the Third international conference on New Trends in Network Architectures and Services conference on Self-Organising Systems
Reliability-aware platform optimization for 3D chip multi-processors
The Journal of Supercomputing
High-throughput differentiated service provision router architecture for wireless network-on-chip
International Journal of High Performance Systems Architecture
Flexible LDPC decoder architectures
VLSI Design - Special issue on Flexible Radio Design: Trends and Challenges in Digital Baseband Implementation
A load-balanced congestion-aware wireless network-on-chip design for multi-core platforms
Microprocessors & Microsystems
An efficient, low-cost routing framework for convex mesh partitions to support virtualization
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on Wireless Health Systems, On-Chip and Off-Chip Network Architectures
Randomized partially-minimal routing: near-optimal oblivious routing for 3-D mesh networks
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Destination-based congestion awareness for adaptive routing in 2D mesh networks
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
Designing on-chip networks for throughput accelerators
ACM Transactions on Architecture and Code Optimization (TACO)
LEF: long edge first routing for two-dimensional mesh network on chip
Proceedings of the Sixth International Workshop on Network on Chip Architectures
Dual partitioning multicasting for high-performance on-chip networks
Journal of Parallel and Distributed Computing
Hi-index | 0.00 |
Minimizing latency and maximizing throughput are important goals in the design of routing algorithms for interconnection networks. Ideally, we would like a routing algorithm to (a) route packets using the minimal number of hops to reduce latency and preserve communication locality, (b) deliver good worst-case and average-case throughput and (c) enable low-complexity (and hence, low latency) router implementation. In this paper, we focus on routing algorithms for an important class of interconnection networks: two dimensional (2D) mesh networks. Existing routing algorithms for mesh networks fail to satisfy one or more of design goals mentioned above. Variously, the routing algorithms suffer from poor worst case throughput (ROMM [13], DOR [23]), poor latency due to increased packet hops (VALIANT [31]) or increased latency due to hardware complexity (minimaladaptive [7, 30]). The major contribution of this paper is the design of an oblivious routing algorithm 驴 O1TURN 驴 with provable near-optimal worst-case throughput, good average-case throughput, low design complexity and minimal number of network hops for 2D-mesh networks, thus satisfying all the stated design goals. O1TURN offers optimal worst-case throughput when the network radix (k in a kxk network) is even. When the network radix is odd, O1TURN is within a 1/k2 factor of optimal worst-case throughput. O1TURN achieves superior or comparable average-case throughput with global traffic as well as local traffic. For example, O1TURN achieves 18.8%, 0.7% and 13.6% higher average-case throughput than DOR, ROMM and VALIANT routing, respectively when averaged over one million random traffic patterns on an 8x8 network. Finally, we demonstrate that O1TURN is well suited for a partitioned router implementation that is of similar delay complexity as a simple dimension-ordered router. Our implementation incurs a marginal increase in switch arbitration delay that is completely hidden in pipelined routers as it is not on the clock-critical path.