High-performance IPv6 forwarding algorithm for multi-core and multithreaded network processor

Authors:
Xianghui Hu; Xinan Tang; Bei Hua
Affiliations:
University of Science and Tech. of China, Hefei, China; Intel Compiler Lab., Santa Clara, California; University of Science and Tech. of China, Hefei, China
Venue:
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
2006

Abstract

IP forwarding is one of the main bottlenecks in Internet backbone routers, as it requires performing the longest-prefix match at 10 Gbps or higher. IPv6 forwarding exacerbates the situation because its addresses are four times as long as IPv4 addresses (128 bits versus 32), which greatly enlarges the search space. We propose a high-performance IPv6 forwarding algorithm, TrieC, and implement it efficiently on the Intel IXP2800 network processor (NPU). Programming the multi-core and multithreaded NPU is a daunting task. We study the interaction between parallel algorithm design and architecture mapping to facilitate efficient algorithm implementation. We experiment with an architecture-aware design principle to guarantee high performance of the resulting algorithm.

This paper investigates the main software design issues that have dramatic performance impacts on any NPU-based implementation: memory space reduction, instruction selection, data allocation, task partitioning, latency hiding, and thread synchronization. We provide insight into how to design an NPU-aware algorithm for high-performance networking applications. Based on a detailed performance analysis of the TrieC algorithm, we provide guidance on developing high-performance networking applications for multi-core and multithreaded architectures.
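To make the longest-prefix match mentioned in the abstract concrete, the following is a minimal sketch using a plain one-bit-at-a-time binary trie over 128-bit IPv6 addresses. It is not the TrieC algorithm, whose data structure is not described in this record, and it ignores every NPU-specific concern the paper studies. The class name `BinaryTrie`, its methods, and the sample prefixes and next-hop labels are all hypothetical, chosen only for illustration.

```python
import ipaddress


class BinaryTrie:
    """Illustrative binary trie for longest-prefix match (NOT TrieC)."""

    def __init__(self):
        # Each node is a dict with optional keys "0", "1" (children)
        # and "hop" (next hop stored at the end of a prefix).
        self.root = {}

    def insert(self, prefix, next_hop):
        net = ipaddress.IPv6Network(prefix)
        # Keep only the first prefixlen bits of the 128-bit address.
        bits = format(int(net.network_address), "0128b")[:net.prefixlen]
        node = self.root
        for b in bits:
            node = node.setdefault(b, {})
        node["hop"] = next_hop

    def lookup(self, address):
        bits = format(int(ipaddress.IPv6Address(address)), "0128b")
        node = self.root
        best = node.get("hop")  # a ::/0 default route, if present
        for b in bits:
            node = node.get(b)
            if node is None:
                break
            # Remember the deepest (longest) matching prefix seen so far.
            best = node.get("hop", best)
        return best


# Hypothetical routing table used only for this example.
table = BinaryTrie()
table.insert("::/0", "default")
table.insert("2001:db8::/32", "A")
table.insert("2001:db8:1::/48", "B")
print(table.lookup("2001:db8:1::5"))
print(table.lookup("2001:db8:2::1"))
print(table.lookup("fe80::1"))
```

In the worst case this sketch visits one node per address bit, so up to 128 dependent memory accesses per lookup. That cost is the kind of bottleneck that motivates more compact, multi-bit schemes such as the one the paper proposes.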