Performance analysis of multi-dimensional packet classification on programmable network processors

Authors:
Deepa Srinivasan;Wu-chang Feng
Affiliations:
IBM Corporation, 3039 Cornwallis Road, BL205/N206, Research Triangle Park, NC 27709, USA;Department of Computer Science, Portland State University, P.O. Box 751, Portland, OR 97207, USA
Venue:
Computer Communications
Year:
2005

Abstract

Multi-field packet classification is frequently performed by network devices such as edge routers and firewalls. Such devices can use programmable network processors to perform this compute-intensive task at nearly line speed. The architectures of programmable network processors are typically highly parallel, and a single algorithm can be mapped onto the hardware in different ways. In this paper, we study the performance of two design mappings of the Bit Vector packet classification algorithm on the Intel® IXP1200 network processor. We show that: (i) overall, the parallel mapping achieves a higher packet processing rate (25% more) than the pipelined mapping; (ii) in the parallel mapping, a processing element's utilization can be considerably affected by code complexity, in terms of branching, because significant time is wasted (as much as 40% more) on aborted instruction execution pipelines; (iii) in the pipelined mapping, multiple memory reads per packet can lower the overall performance.
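For readers unfamiliar with the Bit Vector algorithm named in the abstract, the following is a minimal Python sketch of the general technique, not the authors' IXP1200 implementation. It assumes each rule is a tuple of inclusive integer ranges (one per field) and that a lower rule index means higher priority. The rule set, the function names (`build_dimension`, `lookup`, `classify`) and the two-field layout are hypothetical and chosen only for illustration. Each field is looked up independently and the per-field bit vectors are then ANDed together, which may be one reason the algorithm can be mapped onto parallel hardware in more than one way.

```python
from bisect import bisect_right

def build_dimension(ranges):
    """Precompute one field's lookup table.

    ranges[i] is the inclusive (lo, hi) range of rule i on this field.
    Returns (starts, vectors): sorted start points of the elementary
    intervals and, for each interval, an int whose bit i is set when
    rule i covers that interval. Field values are assumed non-negative.
    """
    starts = sorted({0} | {lo for lo, _ in ranges} | {hi + 1 for _, hi in ranges})
    vectors = []
    for p in starts:
        mask = 0
        for i, (lo, hi) in enumerate(ranges):
            if lo <= p <= hi:
                mask |= 1 << i
        vectors.append(mask)
    return starts, vectors

def lookup(dim, value):
    """Return the bit vector of the elementary interval containing value."""
    starts, vectors = dim
    return vectors[bisect_right(starts, value) - 1]

def classify(dims, packet):
    """AND the per-field bit vectors; return the best rule index or None."""
    match = ~0  # all bits set (Python ints are unbounded)
    for dim, value in zip(dims, packet):
        match &= lookup(dim, value)
    if match == 0:
        return None
    # Lowest set bit corresponds to the highest-priority matching rule.
    return (match & -match).bit_length() - 1

# Hypothetical two-field rule set: (source port range, destination port range).
rules = [
    ((0, 1023), (80, 80)),        # rule 0
    ((0, 65535), (0, 1023)),      # rule 1
    ((1024, 65535), (0, 65535)),  # rule 2
]
dims = [build_dimension([rule[d] for rule in rules]) for d in range(2)]

print(classify(dims, (500, 80)))    # rules 0 and 1 match; rule 0 wins
print(classify(dims, (2000, 443)))  # rules 1 and 2 match; rule 1 wins
print(classify(dims, (500, 5000)))  # no rule matches
```

In this sketch the per-field tables are built once and each packet then costs one interval search per field plus a bitwise AND, which is the trade of memory for lookup time that the technique relies on.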