Tuning string matching for huge pattern sets

  • Authors:
  • Jari Kytöjoki; Leena Salmela; Jorma Tarhio

  • Affiliations:
  • Department of Computer Science and Engineering, Helsinki University of Technology, Finland (all three authors)

  • Venue:
  • CPM '03: Proceedings of the 14th Annual Conference on Combinatorial Pattern Matching
  • Year:
  • 2003

Abstract

We present three algorithms for exact string matching of multiple patterns. Our algorithms are filtering methods that apply q-grams and bit parallelism. We ran extensive experiments with them and compared them with various versions of earlier algorithms, e.g. different trie implementations of the Aho-Corasick algorithm. Our algorithms proved to be substantially faster than earlier solutions for sets of 1,000-100,000 patterns. The gain is due to the improved filtering efficiency caused by q-grams.
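
The abstract does not spell out the three algorithms, so the following is only a minimal sketch of the general idea of q-gram filtering for multiple patterns, not the authors' method. It assumes every pattern is at least q characters long, indexes each pattern by its first q-gram, and verifies a candidate only when the text q-gram at the current position passes the filter. The names `build_qgram_index` and `find_all` are hypothetical, and a plain dictionary stands in for the bit-parallel machinery the paper refers to.

```python
def build_qgram_index(patterns, q):
    """Map the first q-gram of each pattern to the patterns that start with it.

    Assumption for this sketch: every pattern has length >= q.
    """
    index = {}
    for p in patterns:
        if len(p) < q:
            raise ValueError(f"pattern {p!r} is shorter than q={q}")
        index.setdefault(p[:q], []).append(p)
    return index


def find_all(text, patterns, q=3):
    """Return (position, pattern) pairs for every exact occurrence in text.

    Filtering phase: look up the text q-gram at each position.
    Verification phase: compare only the patterns sharing that q-gram.
    """
    index = build_qgram_index(patterns, q)
    matches = []
    for i in range(len(text) - q + 1):
        candidates = index.get(text[i:i + q])
        if candidates is None:
            continue  # the filter rejects this position without any comparison
        for p in candidates:
            if text.startswith(p, i):  # exact verification of a candidate
                matches.append((i, p))
    return matches


if __name__ == "__main__":
    print(find_all("abracadabra", ["abra", "cad", "dab"], q=3))
```

A longer q makes each q-gram rarer, so fewer positions reach the verification step; this is the sense in which q-grams improve filtering efficiency when the pattern set is large.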