Exact multi-pattern string matching on the cell/b.e. processor

  • Authors:
  • Daniele Paolo Scarpazza;Oreste Villa;Fabrizio Petrini

  • Affiliations:
  • IBM T.J. Watson Research Center, Yorktown Heights, NY, USA;Politecnico di Milano, Milano, Italy;IBM T.J. Watson Research Center, Yorktown Heights, NY, USA

  • Venue:
  • Proceedings of the 5th conference on Computing frontiers
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

String searching is the computationally intensive kernel of many security and network applications like search engines, intrusion detection systems, virus scanners and spam filters. The growing size of on-line content and the increasing wire speeds push the need for fast --and often real-time, string searching solutions. Multi-core processors are gaining increasing popularity, thanks to their unprecedented computing power, but they are also bringing new programming challenges. This paper describes a class of high-performance exact string searching solutions that we have optimized for the IBM Cell/B.E. processor using the well known Aho-Corasick algorithm. This class provides several trade-offs between performance and dictionary size. When dictionaries are small enough to fit in the local memories of the processing cores, the throughput reaches 40 Gbps per processor. With larger dictionaries (as many as hundreds of thousands patterns), the typical throughput is between 1.6 and 2.2 Gbps per processor.