We add a new dimension to the widely studied on-line approximate string matching problem by introducing an error threshold parameter $\epsilon$, so that the algorithm is allowed to miss occurrences with probability $\epsilon$. This relaxation is particularly appropriate for this problem, as approximate searching is used to model many cases where exact answers are not mandatory. We show that the relaxed version of the problem allows us to break the average-case optimal lower bound of the classical problem, achieving average-case $O(n \log_\sigma m / m)$ time with any $\epsilon = \mathrm{poly}(k/m)$, where $n$ is the text size, $m$ the pattern length, $k$ the number of differences allowed under edit distance, and $\sigma$ the alphabet size. Our experimental results show the practicality of this novel and promising research direction. Finally, we extend the proposed approach to the multiple approximate string matching setting, where approximate occurrences of $r$ patterns are sought simultaneously. Again, we can break the average-case optimal lower bound of the classical problem, achieving average-case $O(n \log_\sigma (rm) / m)$ time with any $\epsilon = \mathrm{poly}(k/m)$.
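To make the classical (exact-answer) problem concrete, here is a minimal sketch of the textbook column-wise dynamic programming approach, often attributed to Sellers. It reports every text position where an occurrence of the pattern with at most k differences ends. This is not the algorithm proposed in this work: it runs in O(mn) time and never misses an occurrence. The function name `approx_occurrences` is a hypothetical choice made for illustration.

```python
def approx_occurrences(pattern, text, k):
    """Return end positions (0-based) in text where pattern matches
    with edit distance at most k. Classical O(m*n) baseline, shown
    only to illustrate the problem definition."""
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and some
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = col[0]  # value of col[i-1] from the previous column
        # col[0] stays 0: an occurrence may start at any text position.
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == ch else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra text character
                         col[i - 1] + 1)    # missing pattern character
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(approx_occurrences("abc", "xxabcxx", 1))
```

A relaxed algorithm in the sense described above would be allowed to omit some of these positions, each with probability at most $\epsilon$, in exchange for inspecting fewer text characters on average.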