Virtually Pipelined Network Memory

Authors:
Banit Agrawal; Timothy Sherwood
Affiliations:
University of California, Santa Barbara; University of California, Santa Barbara
Venue:
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Year:
2006

Abstract

We introduce virtually-pipelined memory, an architectural technique that efficiently supports high-bandwidth, uniform-latency memory accesses and high-confidence throughput, even under adversarial conditions. We apply this technique to the network processing domain, where memory hierarchy design is an increasingly challenging problem as network bandwidth increases. Virtual pipelining provides the simple-to-analyze programming model of a deep pipeline (deterministic latencies) on top of a completely different physical implementation (a memory system with banks and probabilistic mapping). This allows designers to effectively decouple the analysis of their algorithms and data structures from the analysis of the memory buses and banks. Unlike specialized hardware customized for a specific data-plane algorithm, our system makes no assumption about the memory access patterns. In the domain of network processors this will be of growing importance as the size of the routing tables, the complexity of the packet classification rules, and the amount of packet buffering required all continue to grow at a staggering rate. We present a mathematical argument for our system's ability to provably provide bandwidth with high confidence, and we demonstrate its functionality and area overhead through a synthesizable design. We further show that, even though our scheme is general-purpose enough to support new applications such as packet reassembly, it outperforms the state of the art in specialized packet buffering architectures.
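
The programming model described in the abstract (a fixed-latency pipeline in front of banked memory with a randomized address-to-bank mapping) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' design: the class name `VirtualPipelineSketch`, the parameters (16 banks, 4 busy cycles per access, a fixed delay of 32 cycles), the universal hash used as the mapping, and the choice to count and drop a stalled request rather than retry it are all hypothetical and chosen for illustration.

```python
import random

P = (1 << 61) - 1  # Mersenne prime used by the illustrative universal hash


class VirtualPipelineSketch:
    """Toy model: each accepted read is delivered exactly `delay` cycles after issue.

    All parameters are illustrative assumptions, not values from the paper.
    """

    def __init__(self, banks=16, bank_busy=4, delay=32, hashed=True, seed=1):
        rng = random.Random(seed)
        self.a = rng.randrange(1, P)   # stand-in for a secret mapping key
        self.b = rng.randrange(P)
        self.banks = banks
        self.bank_busy = bank_busy     # cycles a bank is occupied per access
        self.delay = delay             # fixed latency seen by the programmer
        self.hashed = hashed
        self.free_at = [0] * banks     # cycle at which each bank becomes idle
        self.stalls = 0

    def bank_of(self, addr):
        if not self.hashed:
            return addr % self.banks   # plain low-order interleaving
        return ((self.a * addr + self.b) % P) % self.banks

    def issue(self, cycle, addr):
        """Return the delivery cycle, or None if the request would miss its deadline."""
        bank = self.bank_of(addr)
        start = max(cycle, self.free_at[bank])
        if start + self.bank_busy > cycle + self.delay:
            self.stalls += 1           # simplification: count the stall, drop the request
            return None
        self.free_at[bank] = start + self.bank_busy
        return cycle + self.delay


def run(mem, addrs):
    """Issue one request per cycle; return the latency of each accepted request."""
    latencies = []
    for cycle, addr in enumerate(addrs):
        ready = mem.issue(cycle, addr)
        if ready is not None:
            latencies.append(ready - cycle)
    return latencies


if __name__ == "__main__":
    # A stride equal to the bank count is a worst case for plain interleaving.
    stride = [i * 16 for i in range(1000)]
    for hashed in (False, True):
        mem = VirtualPipelineSketch(hashed=hashed)
        lat = run(mem, stride)
        print("hashed" if hashed else "naive", "latencies:", set(lat), "stalls:", mem.stalls)
```

In this toy model every accepted request observes the same latency regardless of which bank serves it, which is the sense in which the algorithm can be analyzed as if it ran on a deep pipeline. With plain interleaving the strided pattern lands on a single bank and stalls repeatedly; with the randomized mapping the same pattern would typically be spread across banks, so stalls should be much rarer, although this sketch makes no claim about the exact rate.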