New and faster filters for multiple approximate string matching

  • Authors:
  • Ricardo Baeza-Yates; Gonzalo Navarro

  • Affiliations:
  • Department of Computer Science, University of Chile, Blanco Encalada 2120, Santiago, Chile; Department of Computer Science, University of Chile, Blanco Encalada 2120, Santiago, Chile

  • Venue:
  • Random Structures & Algorithms
  • Year:
  • 2002

Abstract

We present three new algorithms for on-line multiple string matching allowing errors. They extend previous algorithms that search for a single pattern. In all cases the average running time is linear in the text size for moderate error levels, pattern lengths, and numbers of patterns; the algorithms adapt, at higher cost, to the other cases. They differ, however, in speed and in their thresholds of usefulness. We analyze theoretically when each algorithm should be used and show their performance experimentally. The only previous solution to this problem allows just one error. Our algorithms are the first to allow more errors, and they are faster than previous work for a moderate number of patterns (e.g., fewer than 50-100 on English text, depending on the pattern length).
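
To make the problem statement concrete, the following is a minimal sketch of a naive baseline for multiple approximate string matching. It is not one of the three algorithms the abstract describes. It runs the classical column-wise dynamic programming for a single pattern once per pattern, which is roughly the cost that filtering approaches try to avoid. It assumes that "errors" means edit distance (insertions, deletions, substitutions); the function names `approx_end_positions` and `multi_search` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def approx_end_positions(pattern: str, text: str, k: int) -> List[int]:
    """Return 0-based text positions where an occurrence of `pattern`
    with at most k edit errors ends (assumed error model: edit distance)."""
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text ending at the current position. col[0] stays 0 because
    # an occurrence may start anywhere in the text.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text):
        prev_diag = col[0]  # col[i-1] as it was in the previous column
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == c else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra text character
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


def multi_search(patterns: List[str], text: str, k: int) -> Dict[str, List[int]]:
    """Naive baseline: search each pattern independently, O(r * m * n) time
    for r patterns of length m and a text of length n."""
    return {p: approx_end_positions(p, text, k) for p in patterns}


if __name__ == "__main__":
    # Occurrences of "survey" with at most 2 errors end at positions 4, 5, 6.
    print(approx_end_positions("survey", "surgery", 2))  # [4, 5, 6]
```

The baseline's running time grows linearly with the number of patterns, which is the behavior a multipattern algorithm aims to improve on for moderate error levels.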