Pseudo-randomly interleaved memory
ISCA '91 Proceedings of the 18th annual international symposium on Computer architecture
Out-of-order vector architectures
MICRO 30 Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture
A performance comparison of contemporary DRAM architectures
ISCA '99 Proceedings of the 26th annual international symposium on Computer architecture
Proceedings of the 27th annual international symposium on Computer architecture
Conflict-Free Access for Streams in Multimodule Memories
IEEE Transactions on Computers
On the Security of Randomized CBC-MAC Beyond the Birthday Paradox Limit: A New Construction
FSE '02 Revised Papers from the 9th International Workshop on Fast Software Encryption
Access Order and Effective Bandwidth for Streams on a Direct Rambus Memory
HPCA '99 Proceedings of the 5th International Symposium on High Performance Computer Architecture
Command Vector Memory Systems: High Performance at Low Cost
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
A pipelined memory architecture for high throughput network processors
Proceedings of the 30th annual international symposium on Computer architecture
Efficient use of memory bandwidth to improve network processor throughput
Proceedings of the 30th annual international symposium on Computer architecture
Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Design and Implementation of High-Performance Memory Systems for Future Packet Buffers
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
Parallelism versus memory allocation in pipelined router forwarding engines
Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
A Tree Based Router Search Engine Architecture with Single Port Memories
Proceedings of the 32nd annual international symposium on Computer Architecture
Design of Randomized Multichannel Packet Storage for High Performance Routers
HOTI '05 Proceedings of the 13th Symposium on High Performance Interconnects
Chisel: A Storage-efficient, Collision-free Hash-based Network Processing Architecture
Proceedings of the 33rd annual international symposium on Computer Architecture
Robust TCP stream reassembly in the presence of adversaries
SSYM'05 Proceedings of the 14th conference on USENIX Security Symposium - Volume 14
High-bandwidth network memory system through virtual pipelines
IEEE/ACM Transactions on Networking (TON)
Design and analysis of a robust pipelined memory system
INFOCOM'10 Proceedings of the 29th conference on Information communications
Hi-index | 0.00 |
We introduce virtually-pipelined memory, an architectural technique that efficiently supports high-bandwidth, uniform latency memory accesses, and high-confidence throughput even under adversarial conditions. We apply this technique to the network processing domain where memory hierarchy design is an increasingly challenging problem as network bandwidth increases. Virtual pipelining provides a simple to analyze programing model of a deep pipeline (deterministic latencies) with a completely different physical implementation (a memory system with banks and probabilistic mapping). This allows designers to effectively decouple the analysis of their algorithms and data structures from the analysis of the memory buses and banks. Unlike specialized hardware customized for a specific data-plane algorithm, our system makes no assumption about the memory access patterns. In the domain of network processors this will be of growing importance as the size of the routing tables, the complexity of the packet classification rules, and the amount of packet buffering required, all continue to grow at a staggering rate. We present a mathematical argument for our system's ability to provably provide bandwidth with high confidence and demonstrate its functionality and area overhead through a synthesizable design. We further show that, even though our scheme is general purpose to support new applications such as packet reassembly, it outperforms the state of the art in specialized packet buffering architectures.