A black box for online approximate pattern matching

  • Authors:
  • Raphaël Clifford;Klim Efremenko;Benny Porat;Ely Porat

  • Affiliations:
  • University of Bristol, Dept. of Computer Science, Bristol BS8 1UB, UK;Bar-Ilan University, Dept. of Computer Science, 52900 Ramat-Gan, Israel and Weizmann Institute, Dept. of Computer Science and Applied Mathematics, Rehovot, Israel;Bar-Ilan University, Dept. of Computer Science, 52900 Ramat-Gan, Israel;Bar-Ilan University, Dept. of Computer Science, 52900 Ramat-Gan, Israel

  • Venue:
  • Information and Computation
  • Year:
  • 2011

Abstract

We present a deterministic black box solution for online approximate matching. Given a pattern of length $m$ and a streaming text of length $n$ that arrives one character at a time, the task is to report the distance between the pattern and a sliding window of the text as soon as the new character arrives. Our solution requires $O\left(\sum_{j=1}^{\log_2 m} T(n, 2^{j-1})/n\right)$ time for each input character, where $T(n,m)$ is the total running time of the best offline algorithm. The types of approximation that are supported include exact matching with wildcards, matching under the Hamming norm, approximating the Hamming norm, $k$-mismatch and numerical measures such as the $L_2$ and $L_1$ norms. For these examples, the resulting online algorithms take $O(\log^2 m)$, $O(\sqrt{m \log m})$, $O(\log^2 m/\varepsilon^2)$, $O(\sqrt{k \log k}\,\log m)$, $O(\log^2 m)$ and $O(\sqrt{m \log m})$ time per character, respectively. The space overhead is linear in the pattern size, which we show is optimal for any deterministic algorithm.
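
To make the problem statement concrete, the following is a minimal sketch of the naive online baseline for the Hamming distance case. It is not the black box construction from the paper: it simply compares the pattern against the current window, which costs O(m) time per arriving character and O(m) space. The function name `online_hamming` and the choice to yield `None` before a full window has arrived are assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def online_hamming(pattern: str, stream: Iterable[str]) -> Iterator[Optional[int]]:
    """Yield, after each arriving character, the Hamming distance between
    the pattern and the last len(pattern) characters of the text.

    Yields None until a full window of text has arrived (an illustrative
    convention, not taken from the paper).
    """
    m = len(pattern)
    window = deque(maxlen=m)  # O(m) space: the most recent m characters
    for ch in stream:
        window.append(ch)  # the oldest character is dropped automatically
        if len(window) < m:
            yield None
        else:
            # Naive O(m) comparison per arriving character.
            yield sum(p != t for p, t in zip(pattern, window))


distances = list(online_hamming("aba", "abaabb"))
print(distances)  # [None, None, 0, 2, 2, 1]
```

The abstract's per-character bounds should be read against this baseline: the output stream is the same, one distance per arriving character, but the reported running times are sublinear in the pattern length.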