Annual review of computer science vol. 1, 1986
Small forwarding tables for fast routing lookups
SIGCOMM '97 Proceedings of the ACM SIGCOMM '97 conference on Applications, technologies, architectures, and protocols for computer communication
Smart Memories: a modular reconfigurable architecture
Proceedings of the 27th annual international symposium on Computer architecture
Scalable packet classification
Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications
A stream compiler for communication-exposed architectures
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Orion: a power-performance simulator for interconnection networks
Proceedings of the 35th annual ACM/IEEE international symposium on Microarchitecture
A pipelined memory architecture for high throughput network processors
Proceedings of the 30th annual international symposium on Computer architecture
Exploiting ILP, TLP, and DLP with the polymorphous TRIPS architecture
Proceedings of the 30th annual international symposium on Computer architecture
Proceedings of the 36th annual IEEE/ACM International Symposium on Microarchitecture
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
A Tree Based Router Search Engine Architecture with Single Port Memories
Proceedings of the 32nd annual international symposium on Computer Architecture
Dynamic pipelining: making IP-lookup truly scalable
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
Gigabit routing on a software-exposed tiled-microprocessor
Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems
CAMP: fast and efficient IP lookup architecture
Proceedings of the 2006 ACM/IEEE symposium on Architecture for networking and communications systems
Advanced algorithms for fast and scalable deep packet inspection
Proceedings of the 2006 ACM/IEEE symposium on Architecture for networking and communications systems
Implementation and Evaluation of a Dynamically Routed Processor Operand Network
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
Ethane: taking control of the enterprise
Proceedings of the 2007 conference on Applications, technologies, architectures, and protocols for computer communications
Floodless in seattle: a scalable ethernet architecture for large enterprises
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
A scalable, commodity data center network architecture
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
PLUG: flexible lookup modules for rapid deployment of new protocols in high-speed routers
Proceedings of the ACM SIGCOMM 2009 conference on Data communication
Experiences in Co-designing a Packet Classification Algorithm and a Flexible Hardware Platform
Proceedings of the 2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems
LEAP: latency- energy- and area-optimized lookup pipeline
Proceedings of the eighth ACM/IEEE symposium on Architectures for networking and communications systems
A general constraint-centric scheduling framework for spatial architectures
Proceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation
SWSL: software synthesis for network lookup
ANCS '13 Proceedings of the ninth ACM/IEEE symposium on Architectures for networking and communications systems
Hi-index | 0.00 |
This paper proposes a new architecture called Pipelined LookUp Grid (PLUG) that can perform data structure lookups in network processing. PLUGs are programmable and through simplicity achieve power efficiency. We draw upon one key insights: data structure lookups have natural structure that can be statically determined and exploited. The PLUG execution model transforms data-structure lookups into pipelined stages of computation and associates small code-blocks with data. The PLUG architecture is a tiled architecture with each tile consisting predominantly of SRAMs, a lightweight no-buffering router, and an array of lightweight computation cores. Using a principle of fixed delays in the execution model, the architecture is contention-free and completely statically scheduled thus achieving high energy efficiency. The architecture enables rapid deployment of new network protocols and generalizes as a data-structure accelerator. This paper describes the PLUG architecture, the compiler, and evaluates our RTL prototype PLUG chip synthesized on a 55nm technology library. We evaluate six diverse high-end network processing workloads including IPv4, IPv6, and Ethernet forwarding. We show that at a 55nm technology, a 16-tile PLUG occupies 58 mm2, provides 4MB on-chip storage, and sustains a clock frequency of 1 GHz. This translates to 1 billion lookups per second, a latency of 18ns to 219ns, and average power less than 1 Watt.