A Black Box for Online Approximate Pattern Matching

  • Authors:
  • Raphaël Clifford;Klim Efremenko;Benny Porat;Ely Porat

  • Affiliations:
  • Dept. of Computer Science, University of Bristol, Bristol, UK BS8 1UB;Dept. of Computer Science, Dept. of Computer Science and Applied Mathematics, Bar-Ilan University, Weizmann Institute, Ramat-Gan, Israel 52900;Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel 52900;Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel 52900

  • Venue:
  • CPM '08 Proceedings of the 19th annual symposium on Combinatorial Pattern Matching
  • Year:
  • 2008

Abstract

We present a deterministic black box solution for online approximate matching. Given a pattern of length $m$ and a streaming text of length $n$ that arrives one character at a time, the task is to report the distance between the pattern and a sliding window of the text as soon as the new character arrives. Our solution requires $O(\sum_{j=1}^{\log_2{m}} T(n,2^{j-1})/n)$ time for each input character, where $T(n,m)$ is the total running time of the best offline algorithm. The types of approximation that are supported include exact matching with wildcards, matching under the Hamming norm, approximating the Hamming norm, k-mismatch and numerical measures such as the $L_2$ and $L_1$ norms. For these examples, the resulting online algorithms take $O(\log^2{m})$, $O(\sqrt{m\log{m}})$, $O(\log^2{m}/\epsilon^2)$, $O(\sqrt{k \log k} \log{m})$, $O(\log^2{m})$ and $O(\sqrt{m\log{m}})$ time per character respectively. The space overhead is $O(m)$, which we show is optimal.
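
To make the problem statement concrete, here is a minimal Python sketch of the online task for the Hamming distance. It is a naive baseline, not the black box construction described in the abstract: it spends O(m) time per arriving character, using an O(m) buffer. The function name `online_hamming` and the convention of yielding `None` before a full window is available are assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def online_hamming(pattern: str, text_stream: Iterable[str]) -> Iterator[Optional[int]]:
    """Report the pattern-to-window Hamming distance as each character arrives.

    Yields None until the stream has delivered len(pattern) characters.
    Naive baseline: O(m) time per character, O(m) space.
    """
    m = len(pattern)
    window = deque(maxlen=m)  # retains only the most recent m characters
    for ch in text_stream:
        window.append(ch)
        if len(window) < m:
            yield None  # sliding window is not yet full
        else:
            # Count the positions where pattern and window disagree.
            yield sum(p != t for p, t in zip(pattern, window))


if __name__ == "__main__":
    print(list(online_hamming("aba", "abababb")))
    # prints [None, None, 0, 3, 0, 3, 1]
```

Each output is produced before the next character is read, which is the online requirement; the per-character bounds quoted in the abstract concern doing this faster than the O(m) recomputation used here.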