High-bandwidth network memory system through virtual pipelines

Authors:
Banit Agrawal; Timothy Sherwood
Affiliations:
Computer Science Department, University of California, Santa Barbara, Santa Barbara, CA; Computer Science Department, University of California, Santa Barbara, Santa Barbara, CA
Venue:
IEEE/ACM Transactions on Networking (TON)
Year:
2009

Abstract

As network bandwidth increases, designing an effective memory system for network processors becomes a significant challenge. The size of the routing tables, the complexity of the packet classification rules, and the amount of packet buffering required all continue to grow at a staggering rate. Relying on large, fast SRAMs alone is unlikely to be scalable or cost-effective. Instead, trends point to the use of low-cost commodity DRAM devices as a means to deliver the worst-case memory performance that network data-plane algorithms demand. While DRAMs can deliver a great deal of throughput, memory banking significantly complicates the worst-case analysis, and specialized algorithms are needed to ensure that specific types of access patterns are conflict-free. We introduce virtually pipelined memory, an architectural technique that efficiently supports high-bandwidth, uniform-latency memory accesses and high-confidence throughput even under adversarial conditions. Virtual pipelining provides the simple-to-analyze programming model of a deep pipeline (deterministic latencies) on top of a completely different physical implementation (a memory system with banks and probabilistic mapping). This allows designers to decouple the analysis of their algorithms and data structures from the analysis of the memory buses and banks. Unlike specialized hardware customized for a specific data-plane algorithm, our system makes no assumptions about the memory access patterns. We present a mathematical argument for our system's ability to provably provide bandwidth with high confidence, and we demonstrate its functionality and area overhead through a synthesizable design. We further show that, even though our scheme is general-purpose enough to support new applications such as packet reassembly, it outperforms the state of the art in specialized packet buffering architectures.
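
The contrast between the programming model (a deep pipeline with deterministic latency) and the physical implementation (banks with probabilistic mapping) can be illustrated with a small software model. The following Python sketch is a toy under stated assumptions, not the authors' hardware design: the class name, the parameters `num_banks`, `bank_busy`, and `delay`, the keyed BLAKE2b hash used as the mapping, and the stall rule are all hypothetical choices made for illustration.

```python
import hashlib


class VirtualPipelineSketch:
    """Toy model (hypothetical): banked memory presented as a fixed-latency pipeline.

    Each bank is busy for `bank_busy` cycles per access. Every accepted
    request is answered exactly `delay` cycles after it was issued.
    """

    def __init__(self, num_banks, bank_busy, delay, key):
        self.num_banks = num_banks
        self.bank_busy = bank_busy
        self.delay = delay
        self.key = key                    # secret key, assumed hidden from an adversary
        self.free_at = [0] * num_banks    # cycle at which each bank is next idle

    def bank_of(self, address):
        # Keyed hash as a stand-in for the probabilistic address-to-bank mapping.
        digest = hashlib.blake2b(
            address.to_bytes(8, "little"), key=self.key, digest_size=8
        ).digest()
        return int.from_bytes(digest, "little") % self.num_banks

    def issue(self, cycle, address):
        """Return the delivery cycle, or None if the request cannot meet the deadline."""
        bank = self.bank_of(address)
        start = max(cycle, self.free_at[bank])   # wait for earlier work on this bank
        finish = start + self.bank_busy
        if finish > cycle + self.delay:
            return None                          # backlog exceeds the virtual pipeline depth
        self.free_at[bank] = finish
        return cycle + self.delay                # uniform latency, independent of bank state


# Same address every cycle, so every request lands on the same bank.
vp = VirtualPipelineSketch(num_banks=4, bank_busy=4, delay=8, key=b"hypothetical-key")
print([vp.issue(cycle, 5) for cycle in range(3)])   # prints [8, 9, None]
```

In this model the caller only ever observes a latency of `delay` cycles or a stall. Bank conflicts are absorbed by the slack between `bank_busy` and `delay`, and the keyed mapping is what makes conflicts between distinct addresses hard to construct on purpose. The repeated-address case above is deliberately pessimistic: the sketch makes no attempt to merge duplicate requests.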