Multipattern string matching with q-grams

  • Authors: Leena Salmela, Jorma Tarhio, Jari Kytöjoki
  • Affiliations: Helsinki University of Technology, Finland (all authors)
  • Venue: Journal of Experimental Algorithmics (JEA)
  • Year: 2007

Abstract

We present three algorithms for exact string matching of multiple patterns. Our algorithms are filtering methods that apply q-grams and bit parallelism. We ran extensive experiments with them and compared them with various versions of earlier algorithms, e.g., different trie implementations of the Aho-Corasick algorithm. All of our algorithms turned out to be substantially faster than earlier solutions for sets of 1,000 to 10,000 patterns, and the good performance of two of them continues up to 100,000 patterns. The gain is due to the improved filtering efficiency that q-grams provide.
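
To make the idea of a q-gram filter with bit parallelism concrete, here is a minimal sketch in Python. It is not taken from the paper and does not reproduce any of its three algorithms. It assumes a Shift-And style filter over overlapping q-grams, in which all patterns are truncated to the length of the shortest one and every candidate position is then verified against the full pattern set. The names build_filter and search, and the default q=2, are hypothetical choices made for illustration.

```python
def build_filter(patterns, q):
    """Map each q-gram to a bitmask of the positions where it occurs."""
    m = min(len(p) for p in patterns)   # only the first m characters are filtered
    if m < q:
        raise ValueError("every pattern must have at least q characters")
    masks = {}
    for p in patterns:
        for j in range(m - q + 1):
            gram = p[j:j + q]
            masks[gram] = masks.get(gram, 0) | (1 << j)
    return masks, m


def search(text, patterns, q=2):
    """Return sorted (position, pattern) pairs for all exact occurrences."""
    masks, m = build_filter(patterns, q)
    accept = 1 << (m - q)               # set once m - q + 1 consecutive q-grams fit
    state = 0
    hits = set()
    for i in range(len(text) - q + 1):
        # Shift-And step: extend every partial match by one text q-gram.
        state = ((state << 1) | 1) & masks.get(text[i:i + q], 0)
        if state & accept:
            start = i - (m - q)         # candidate start reported by the filter
            # Verification: the filter mixes q-grams from different patterns,
            # so each candidate is checked against every pattern.
            for p in patterns:
                if text.startswith(p, start):
                    hits.add((start, p))
    return sorted(hits)


if __name__ == "__main__":
    print(search("abracadabra", ["abra", "cad", "dab"]))
```

The verification loop here is deliberately naive (it tries every pattern at each candidate position). A larger q makes each table lookup more selective, so fewer text positions reach verification, which is the filtering effect the abstract credits for the speed-up.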