Fast and cache-oblivious dynamic programming with local dependencies
LATA'12 Proceedings of the 6th international conference on Language and Automata Theory and Applications
We study the classical approximate string matching problem: given strings P and Q and an error threshold k, find all ending positions of substrings of Q whose edit distance to P is at most k. Let P and Q have lengths m and n, respectively. On a standard unit-cost word RAM with word size $w \geq \log n$, we present an algorithm using time $$O\biggl(nk \cdot \min\biggl(\frac{\log^2 m}{\log n},\frac{\log^2 m\log w}{w}\biggr) + n\biggr).$$ When P is short, namely $m = 2^{o(\sqrt{\log n}\,)}$ or $m = 2^{o(\sqrt{w/\log w}\,)}$, this improves the previously best known time bounds for the problem. The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism.
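To make the problem statement concrete, the following is a minimal sketch of the textbook O(nm) dynamic-programming baseline (often attributed to Sellers). It is not the algorithm of this paper and uses neither the Landau-Vishkin scheme, tabulation, nor word-level parallelism. The function name `approx_match_ends` and the choice of 1-based ending positions are assumptions made for illustration only.

```python
def approx_match_ends(P, Q, k):
    """Return the 1-based positions j such that some substring of Q
    ending at j has edit distance at most k to P.

    Illustrative baseline only: plain O(n*m) column-wise dynamic
    programming, not the word-RAM algorithm described in the abstract.
    """
    m = len(P)
    # col[i] holds the minimum edit distance between P[:i] and any
    # substring of Q ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(Q, start=1):
        prev_diag = col[0]  # value of row 0 in the previous column (always 0)
        col[0] = 0          # a match may start at any position of Q
        for i in range(1, m + 1):
            old = col[i]    # entry (i, j-1), needed as the next diagonal
            col[i] = min(
                prev_diag + (P[i - 1] != c),  # match or substitution
                old + 1,                      # extra character in Q
                col[i - 1] + 1,               # character of P left unmatched
            )
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(approx_match_ends("abc", "xabcx", 1))
```

With P = "abc" and Q = "xabcx", the exact occurrence ends at position 4, and allowing k = 1 error additionally admits the endings at positions 3 ("ab") and 5 ("abcx"). The baseline keeps only one column of the table at a time, so it uses O(m) space on top of the output.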