Off-chip communication architectures for high throughput network processors

Authors:
Jacob Engel; Taskin Kocak
Affiliations:
Marvell Semiconductors, Inc., Austin, TX 78759, USA; Dept. of Electrical and Electronic Engr., University of Bristol, Bristol BS8 1UB, UK
Venue:
Computer Communications
Year:
2009

Abstract

In this paper, we propose a new interconnection mechanism for network line cards. We project that packet storage needs for next-generation networks will be much higher, such that the number of memory modules required to store the packets will exceed the number that can be directly connected to the network processor (NPU). In other words, the NPU's I/O pins are limited and do not scale well with the growing number of memory modules and processing elements employed on network line cards. We therefore explore more suitable off-chip interconnect and communication mechanisms that can replace the existing systems and provide extraordinarily high throughput. In particular, we investigate whether packet-switched k-ary n-cube networks can be a solution. To the best of our knowledge, this is the first time k-ary n-cube networks have been used on a board. We investigate multiple k-ary n-cube based interconnects and include a variation of the 2-ary 3-cube interconnect called the 3D-mesh. All of the k-ary n-cube interconnects include multiple, highly efficient techniques to route, switch, and control packet flows in order to minimize congestion spots and packet loss within the interconnects. We explore the tradeoffs between implementation constraints and performance. Performance results show that k-ary n-cube topologies significantly outperform the existing line card interconnects and are able to sustain higher traffic loads. Furthermore, the 3D-mesh achieves the highest performance of all the interconnects and allows future scalability to accommodate more memories and/or processors to increase the line card's processing power.
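For readers unfamiliar with the topology named in the abstract, the following is a minimal sketch of the standard textbook definition of a k-ary n-cube: k^n nodes, each addressed by n digits in the range 0..k-1, with links between nodes whose addresses differ by one (modulo k) in a single dimension. The function names are hypothetical and chosen for illustration only. This is not the paper's implementation, and it does not model the 3D-mesh variation, routing, switching, or flow control.

```python
from itertools import product


def kary_ncube_nodes(k, n):
    """Return all node addresses: n digits, each in 0..k-1 (k**n nodes)."""
    return list(product(range(k), repeat=n))


def kary_ncube_neighbors(node, k):
    """Return the neighbors of a node.

    A neighbor differs by +1 or -1 (mod k) in exactly one dimension.
    For k == 2 the two directions reach the same node, so the set
    removes the duplicate.
    """
    neighbors = set()
    for dim, digit in enumerate(node):
        for step in (1, -1):
            other = list(node)
            other[dim] = (digit + step) % k
            neighbors.add(tuple(other))
    neighbors.discard(node)  # only relevant for the degenerate case k == 1
    return sorted(neighbors)


def hop_distance(a, b, k):
    """Minimal hop count between two nodes, using wraparound links."""
    return sum(min(abs(x - y), k - abs(x - y)) for x, y in zip(a, b))


if __name__ == "__main__":
    nodes = kary_ncube_nodes(2, 3)
    print(len(nodes), "nodes in a 2-ary 3-cube")
    print("neighbors of (0, 0, 0):", kary_ncube_neighbors((0, 0, 0), 2))
```

Under this definition a 2-ary 3-cube has eight nodes with three links each, which is the ordinary cube that the abstract's 3D-mesh is described as a variation of.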