High performance dictionary-based string matching for deep packet inspection

  • Authors:
  • Yi-Hua E. Yang;Hoang Le;Viktor K. Prasanna

  • Affiliations:
  • Ming Hsieh Dept. of Electrical Eng., University of Southern California;Ming Hsieh Dept. of Electrical Eng., University of Southern California;Ming Hsieh Dept. of Electrical Eng., University of Southern California

  • Venue:
  • INFOCOM'10 Proceedings of the 29th conference on Information communications
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Dictionary-Based String Matching (DBSM) is used in network Deep Packet Inspection (DPI) applications virus scanning [1] and network intrusion detection [2]. We propose the Pipelined Affix Search with Tail Acceleration (PASTA) architecture for solving DBSM with guaranteed worst-case performance. Our PASTA architecture is composed of a Pipelined Affix Search Relay (PASR) followed by a Tail Acceleration Finite Automaton (TAFA). PASR consists of one or more pipelined Binary Search Tree (pBST) modules arranged in a linear array. TAFA is constructed with the Aho-Corasick goto and failure functions [3] in a compact multi-path and multi-stride tree structure. Both PASR and TAFA achieve good memory efficiency of 1.2 and 2 B/ch (bytes per character) respectively and are pipelined to achieve a high clock rate of 200 MHz on FPGAs. Because PASTA does not depend on the effectiveness of any hash function or the property of the input stream, its performance is guaranteed in the worst case. Our prototype implementation of PASTA on an FPGA with 10 Mb on-chip block RAM achieves 3.2 Gbps matching throughput against a dictionary of over 700K characters. This level of performance surpasses the requirements of next-generation security gateways for deep packet inspection.