High-performance packet classification algorithm for multithreaded IXP network processor

  • Authors:
  • Duo Liu; Zheng Chen; Bei Hua; Nenghai Yu; Xinan Tang

  • Affiliations:
  • Southwest University of Science and Technology, University of Science and Technology of China, and Suzhou Institute for Advanced Study of USTC; University of Science and Technology of China, Hefei, P.R. China; University of Science and Technology of China and Suzhou Institute for Advanced Study of USTC, Hefei, P.R. China; University of Science and Technology of China, Hefei, P.R. China; Intel Corporation, Santa Clara, California

  • Venue:
  • ACM Transactions on Embedded Computing Systems (TECS)
  • Year:
  • 2008

Abstract

Packet classification is crucial for the Internet to provide more value-added services and guaranteed quality of service. Besides hardware-based solutions, many software-based classification algorithms have been proposed. However, classifying at 10 Gbps or higher remains a challenging problem and is still one of the performance bottlenecks in core routers. In general, classification algorithms face the same challenge of balancing high classification speed against low memory requirements. This paper proposes a modified recursive flow classification (RFC) algorithm, Bitmap-RFC, which significantly reduces the memory requirements of RFC by applying a bitmap compression technique. To increase classification speed, we exploit the multithreaded architectural features of the NPU at every stage of development, from algorithm design to implementation. As a result, Bitmap-RFC strikes a good balance between speed and space: it maintains high classification speed while significantly reducing memory consumption. This paper investigates the main NPU software design aspects that have a dramatic performance impact on any NPU-based implementation: memory space reduction, instruction selection, data allocation, task partitioning, and latency hiding. We experiment with an architecture-aware design principle to guarantee the high performance of the classification algorithm on an NPU implementation. The experimental results show that the Bitmap-RFC algorithm achieves 10 Gbps or higher and scales well on the Intel IXP2800 NPU.
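
The abstract does not spell out the table layout used by Bitmap-RFC, so the following is only a minimal sketch of the general bitmap compression idea, under stated assumptions. It assumes a lookup table is split into blocks of 32 entries and that consecutive entries often repeat. Each block keeps a 32-bit bitmap in which a set bit marks the start of a new run, plus a compact array holding one value per run. A lookup counts the set bits up to the requested position to find the run index. The names `bitmap_block`, `compress_block`, `lookup`, and `popcount32` are hypothetical and are not taken from the paper.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK 32u /* entries per bitmap word (an assumption of this sketch) */

/* One compressed block: bit i of bitmap is set when entry i starts a new run. */
typedef struct {
    uint32_t bitmap;
    uint16_t values[BLOCK]; /* only the first popcount32(bitmap) slots are used */
} bitmap_block;

/* Portable population count; a hardware instruction could replace this loop. */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Compress BLOCK uncompressed entries; returns the number of values stored. */
static unsigned compress_block(const uint16_t *src, bitmap_block *dst)
{
    unsigned used = 0;
    dst->bitmap = 0;
    for (unsigned i = 0; i < BLOCK; i++) {
        if (i == 0 || src[i] != src[i - 1]) {
            dst->bitmap |= (uint32_t)1 << i; /* mark the start of a run */
            dst->values[used++] = src[i];
        }
    }
    return used;
}

/* Return the original entry i without decompressing the block. */
static uint16_t lookup(const bitmap_block *b, unsigned i)
{
    assert(i < BLOCK);
    /* Mask keeps bits 0..i; handle i == 31 separately to avoid a 32-bit shift. */
    uint32_t mask = (i == BLOCK - 1) ? 0xFFFFFFFFu
                                     : (((uint32_t)1 << (i + 1)) - 1u);
    /* Bit 0 is always set, so the count is at least 1. */
    return b->values[popcount32(b->bitmap & mask) - 1u];
}
```

In this sketch a block with three runs stores three values and one bitmap word instead of 32 entries, at the cost of one population count per lookup. That trade of extra computation for fewer stored entries is the kind of speed versus space balance the abstract describes.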