Compiling high throughput network processors

Authors:
Maysam Lavasani;Larry Dennison;Derek Chiou
Affiliations:
Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA;Lightwolf Technologies, Walpole, MA, USA;Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA
Venue:
Proceedings of the ACM/SIGDA international symposium on Field Programmable Gate Arrays
Year:
2012

Abstract

Gorilla is a methodology for generating FPGA-based solutions that is especially well suited to data-parallel applications with fine-grained irregularity. Such irregularity both degrades performance and increases power consumption on many data-parallel processors, such as general-purpose graphics processing units (GPGPUs). Gorilla achieves high performance and low power through FPGA-tailored parallelization techniques and application-specific hardwired accelerators, processing engines, and communication mechanisms. Automatic compilation from a stylized C language, together with templates that define the hardware structure and the intrinsic flexibility of FPGAs, provides high performance, low power, and programmability. Gorilla's capabilities are demonstrated by generating a family of core-router network processors that process up to 100 Gbps (200 MPPS for 64 B packets) and support any mix of IPv4, IPv6, and Multi-Protocol Label Switching (MPLS) packets on a single FPGA with off-chip IP lookup tables. A 40 Gbps version of that network processor was run with an embedded test rig on a Xilinx Virtex-6 FPGA to verify its performance and correctness. Its measured power consumption is comparable to that of full-custom commercial network processors. In addition, Gorilla is shown to generate merged virtual routers, which saves FPGA resources.
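
The relationship between the quoted line rate and packet rate can be checked with a short back-of-the-envelope calculation. The sketch below is not from the paper: the function name `packets_per_second` and its parameters are hypothetical, and the abstract does not say whether Ethernet framing overhead is counted. Under the assumption that only the 64 B of each packet is counted, 100 Gbps corresponds to roughly 195 MPPS, consistent with the rounded 200 MPPS figure. Adding the standard 20 B of Ethernet preamble and inter-frame gap per packet would give a lower rate.

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int,
                       overhead_bytes: int = 0) -> float:
    """Packet rate needed to fill a link with fixed-size packets.

    overhead_bytes is an assumed per-packet framing cost (0 means the
    packet bytes alone occupy the link).
    """
    bits_per_packet = (packet_bytes + overhead_bytes) * 8
    return line_rate_bps / bits_per_packet


# Assumption: 100 Gbps of 64 B packets, no framing overhead counted.
raw_mpps = packets_per_second(100e9, 64) / 1e6

# Assumption: standard Ethernet overhead of 8 B preamble + 12 B inter-frame gap.
wire_mpps = packets_per_second(100e9, 64, overhead_bytes=20) / 1e6

print(f"{raw_mpps:.1f} MPPS without framing overhead")
print(f"{wire_mpps:.1f} MPPS with 20 B of framing overhead")
```

Either way, the per-packet time budget at this rate is only a few nanoseconds, which is why the abstract emphasizes parallel processing engines and hardwired accelerators.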