Weighted sequences have been recently introduced as a tool to handle a set of sequences that are not identical but have many local similarities. The weighted sequence is a “statistical image” of this set, where the probability of every symbol's occurrence at every text location is given. We address the problem of approximately matching a pattern in such a weighted sequence. The pattern is a given string and we seek all locations in the set where the pattern occurs with a high enough probability. We define the notion of Hamming distance and edit distance in weighted sequences and give efficient algorithms for computing them. We compute two versions of the Hamming distance in time $O(n \sqrt{m\log m})$, where $n$ is the length of the weighted text and $m$ is the pattern length. The edit distance is computed in time $O(nm)$ and $O(nm^2)$, depending on the edit distance definition used. Unfortunately, due to space considerations, the edit distance details are left to the journal version. We also define the notion of weighted matching in infinite alphabets and show that exact weighted matching can be computed in time $O(s\log^2 s)$, where $s$ is the number of text symbols having non-zero probability. The weighted Hamming distance over infinite alphabets can be computed in time $\min(O(kn\sqrt{s}+s^{3/2}\log^2s), O(s^{4/3}m^{1/3}\log s))$.
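To make the matching problem concrete, the following is a minimal Python sketch of a naive $O(nm)$ baseline, not the algorithms of the paper. It assumes that a weighted text is stored as a list of per-position symbol distributions, that positions are independent (so the probability of an occurrence is the product of the per-position symbol probabilities), and that "high enough" means at least a caller-supplied threshold. The names `occurrence_probability`, `weighted_matches`, and `threshold` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Assumed representation: one {symbol: probability} map per text position.
WeightedText = List[Dict[str, float]]


def occurrence_probability(text: WeightedText, pattern: str, start: int) -> float:
    """Probability that `pattern` occurs at `start`, assuming independent positions."""
    prob = 1.0
    for offset, symbol in enumerate(pattern):
        # A symbol missing from a position's map is treated as probability 0.
        prob *= text[start + offset].get(symbol, 0.0)
        if prob == 0.0:
            break  # the product cannot recover once it reaches zero
    return prob


def weighted_matches(text: WeightedText, pattern: str, threshold: float) -> List[int]:
    """Naive O(nm) scan: every start whose occurrence probability is >= threshold."""
    n, m = len(text), len(pattern)
    return [
        start
        for start in range(n - m + 1)
        if occurrence_probability(text, pattern, start) >= threshold
    ]


# Illustrative data only (not taken from the paper).
sample_text: WeightedText = [
    {"a": 0.5, "c": 0.5},
    {"a": 1.0},
    {"a": 0.25, "g": 0.75},
    {"a": 1.0},
]
matches = weighted_matches(sample_text, "aa", 0.5)  # [0]
```

In the sample, the pattern `aa` occurs at position 0 with probability 0.5 and at positions 1 and 2 with probability 0.25, so only position 0 meets a threshold of 0.5. The quadratic scan is included only as a reference point for the problem statement.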