Randomized partially-minimal routing: near-optimal oblivious routing for 3-D mesh networks

Authors:
Rohit Sunkam Ramanujam; Bill Lin
Affiliations:
Department of Electrical and Computer Engineering, University of California at San Diego, La Jolla, CA (both authors)
Venue:
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Year:
2012

Abstract

The increasing viability of 3-D silicon integration technology has opened new opportunities for chip architecture innovation. One direction is the extension of 2-D mesh-based tiled chip-multiprocessor architectures into three dimensions. This paper focuses on efficient routing algorithms for such 3-D mesh networks. Existing routing algorithms suffer from either poor worst-case throughput (DOR, ROMM) or poor latency (VAL). Although the minimal routing algorithm O1TURN, proposed in prior work, already achieves near-optimal worst-case throughput for 2-D mesh networks, that optimality result does not extend to higher dimensions: for 3-D and higher-dimensional meshes, the worst-case throughput of O1TURN degrades substantially. The main contribution of this paper is a new oblivious routing algorithm for 3-D mesh networks called randomized partially-minimal (RPM) routing. RPM provably achieves optimal worst-case throughput for 3-D meshes when the network radix k is even, and comes within a factor of 1/k^2 of optimal worst-case throughput when k is odd. Moreover, whereas VAL achieves optimal worst-case throughput at a penalty factor of 2 in average latency over DOR, RPM achieves (near-)optimal worst-case throughput with a much smaller penalty factor of 1.33. For practical asymmetric 3-D mesh configurations, where there are fewer device layers than tiles along the edge of a layer, the average latency of RPM drops to just 1.11 to 1.19 times that of DOR. Additionally, a variant of RPM called randomized minimal first (RMF) routing is proposed, which leverages the inherent load-balancing properties of the network traffic to further reduce packet latency without compromising throughput.
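
The latency factors quoted above can be sanity-checked with a short hop-count calculation. The sketch below is illustrative only and rests on assumptions the abstract does not state: average hop count stands in for latency, traffic is uniform random, and RPM's extra cost comes from a two-phase vertical route (source layer to a uniformly random intermediate layer, then on to the destination layer) with minimal routing in the other two dimensions. The function names are hypothetical.

```python
from fractions import Fraction


def mean_distance(n: int) -> Fraction:
    """Mean |a - b| for a, b drawn independently and uniformly from 0..n-1."""
    total = sum(abs(a - b) for a in range(n) for b in range(n))
    return Fraction(total, n * n)


def dor_hops(k: int, layers: int) -> Fraction:
    """Average hops for minimal (dimension-order) routing on a k x k x layers mesh."""
    return 2 * mean_distance(k) + mean_distance(layers)


def rpm_hops(k: int, layers: int) -> Fraction:
    """Average hops under the ASSUMED RPM model: minimal in X and Y, but two
    independent vertical legs (source -> random layer -> destination)."""
    return 2 * mean_distance(k) + 2 * mean_distance(layers)


def latency_ratio(k: int, layers: int) -> Fraction:
    """Ratio of average hop counts, assumed-RPM over DOR."""
    return rpm_hops(k, layers) / dor_hops(k, layers)


if __name__ == "__main__":
    print(latency_ratio(8, 8))                    # 4/3 for a symmetric mesh
    print(round(float(latency_ratio(8, 4)), 2))   # 1.19 for 8 x 8 x 4
    print(round(float(latency_ratio(16, 4)), 2))  # 1.11 for 16 x 16 x 4
```

Under these assumptions the symmetric case gives exactly 4/3, and the 8 x 8 x 4 and 16 x 16 x 4 configurations give about 1.19 and 1.11, consistent with the range quoted in the abstract. The mesh sizes are examples chosen here, not configurations taken from the paper.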