Data structure lookups are among the most expensive operations on a router's critical path in terms of latency and power, so efficient lookup engines are crucial. Several approaches have been proposed, based on custom ASICs, general-purpose processors, or specialized engines. ASICs deliver high performance but have long design cycles and limited flexibility, while general-purpose processors present the opposite trade-off. Specialized programmable engines achieve some of the benefits of both approaches, but they are still hard to program and are limited in either flexibility or performance. In this paper we investigate a different design point. Our solution, SWSL (SoftWare Synthesis for network Lookup), generates hardware logic directly from lookup applications written in C++. It therefore retains a simple programming model while delivering significant performance and power gains. Moreover, compiled applications can be deployed on either an FPGA or an ASIC, enabling a further trade-off between flexibility and performance. While most high-level synthesis compilers focus on loop acceleration, SWSL generates entire lookup chains and pipelines them aggressively to achieve high throughput. Initial results are promising: compared with a previously proposed solution, SWSL gives 2-4x lower latency and 3-4x smaller chip area with reasonable power consumption.
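To make "lookup application written in C++" concrete, the sketch below shows one plausible example of such a program: a fixed-stride trie performing longest-prefix match. It is an illustration only, not code from the paper and not SWSL's input format; the names `Entry`, `Node`, `Trie`, `kStride` and `kStages`, the 8-bit key width, and the restriction to prefix lengths that are multiples of the stride are all assumptions made to keep the example small. Each loop iteration in `lookup` is one dependent memory read into that stage's own table, which is the kind of chain of accesses the abstract calls a lookup chain.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical parameters, chosen only to keep the example small.
constexpr int kStride = 2;  // key bits consumed per stage
constexpr int kStages = 4;  // 4 stages x 2 bits = 8-bit keys

// One slot of a trie node: an optional result plus an optional child pointer.
struct Entry {
    bool          has_hop  = false;  // a prefix ends at this slot
    std::uint16_t next_hop = 0;      // result reported when has_hop is set
    std::int32_t  child    = -1;     // node index in the next stage's table, -1 if none
};
using Node = std::array<Entry, 1 << kStride>;

struct Trie {
    std::array<std::vector<Node>, kStages> stage;  // one table (memory) per stage

    Trie() { stage[0].emplace_back(); }  // stage 0 holds only the root node

    // Extract the chunk of key bits consumed by stage s (most significant first).
    static unsigned chunk(std::uint8_t key, int s) {
        return (key >> (8 - kStride * (s + 1))) & ((1u << kStride) - 1);
    }

    // Insert a prefix stored in the high-order bits of `prefix`.
    // Assumption: len is a non-zero multiple of kStride and at most 8.
    void insert(std::uint8_t prefix, int len, std::uint16_t hop) {
        const int stages = len / kStride;
        int node = 0;
        for (int s = 0; s < stages; ++s) {
            Entry& e = stage[s][node][chunk(prefix, s)];
            if (s == stages - 1) {  // prefix ends here: record the result
                e.has_hop = true;
                e.next_hop = hop;
                return;
            }
            if (e.child < 0) {      // allocate a node in the next stage's table
                stage[s + 1].emplace_back();
                e.child = static_cast<std::int32_t>(stage[s + 1].size()) - 1;
            }
            node = e.child;
        }
    }

    // Longest-prefix match: returns the next hop, or -1 if no prefix matches.
    int lookup(std::uint8_t key) const {
        int best = -1;
        int node = 0;
        for (int s = 0; s < kStages; ++s) {  // one memory read per stage
            const Entry& e = stage[s][node][chunk(key, s)];
            if (e.has_hop) best = e.next_hop;  // remember the longest match so far
            if (e.child < 0) break;            // no deeper node to visit
            node = e.child;
        }
        return best;
    }
};
```

As a usage example, after inserting the 2-bit prefix `10` with next hop 1 and the 4-bit prefix `1011` with next hop 2, a key beginning `1011` resolves to 2, a key beginning `1000` falls back to 1, and a key beginning `01` has no match.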