On-Line Approximate String Matching with Bounded Errors

  • Authors:
  • Marcos Kiwi;Gonzalo Navarro;Claudio Telha

  • Affiliations:
  • Departamento de Ingeniería Matemática, Centro de Modelamiento Matemático UMI 2807 CNRS-UChile;Department of Computer Science, University of Chile;Operations Research Center, MIT

  • Venue:
  • CPM '08 Proceedings of the 19th annual symposium on Combinatorial Pattern Matching
  • Year:
  • 2008

Abstract

We introduce a new dimension to the widely studied on-line approximate string matching problem by introducing an error threshold parameter $\epsilon$, so that the algorithm is allowed to miss occurrences with probability $\epsilon$. This is particularly appropriate for this problem, as approximate searching is used to model many cases where exact answers are not mandatory. We show that the relaxed version of the problem allows us to break the average-case optimal lower bound of the classical problem, achieving average-case $O(n \log_\sigma m / m)$ time with any $\epsilon = \textrm{poly}(k/m)$, where $n$ is the text size, $m$ the pattern length, $k$ the number of errors for edit distance, and $\sigma$ the alphabet size. Our experimental results show the practicality of this novel and promising research direction.
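
To make the classical problem concrete, the following is a minimal sketch of the textbook dynamic-programming baseline for on-line approximate matching under edit distance. It is not the algorithm of this paper: it reports every occurrence (so it corresponds to $\epsilon = 0$) and takes $O(mn)$ time. The function name `approx_occurrences` and its interface are assumptions made for illustration only.

```python
def approx_occurrences(pattern: str, text: str, k: int) -> list[int]:
    """Return the 0-based end positions j such that some substring of
    `text` ending at j is within edit distance k of `pattern`.

    Illustrative classical baseline (column-wise dynamic programming);
    it never misses an occurrence and is NOT the paper's algorithm.
    """
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        # col[0] stays 0 because an occurrence may start anywhere.
        prev_diag = col[0]
        for i in range(1, m + 1):
            old = col[i]  # value from the previous text position
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(
                prev_diag + cost,  # match or substitution
                old + 1,           # extra text character
                col[i - 1] + 1,    # pattern character left unmatched
            )
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(approx_occurrences("abc", "xxabcxx", 0))  # exact occurrences only
    print(approx_occurrences("abc", "xxabcxx", 1))  # up to one edit allowed
```

With `k = 0` the sketch degenerates to exact matching. Raising `k` also reports positions where a prefix or an extended version of the pattern ends, which is why neighbouring end positions tend to appear together.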