In this work, we address the problem of text indexing for approximate matching. Given a text $\mathcal{T}$ that is preprocessed to generate an index, we can later query this index to identify the places where a string occurs with up to a certain number of errors $k$ (edit distance). The indexing structure occupies $\mathcal{O}(n\log^k n)$ space in the average case, independent of the alphabet size. This structure can be used to report the existence of a match with $k$ errors in $\mathcal{O}(3^k m^{k+1})$ time and to report the occurrences in $\mathcal{O}(3^k m^{k+1} + \mbox{\it ed})$ time, where $m$ is the length of the pattern and $\mbox{\it ed}$ is the number of matching edit scripts. The construction of the structure takes time bounded by $\mathcal{O}(kN|\Sigma|)$, where $N$ is the number of nodes in the index and $|\Sigma|$ is the alphabet size.
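To make the query semantics concrete, the following is a minimal sketch of an index-free baseline: the classical online dynamic-programming scan that finds where a pattern occurs in a text with at most $k$ edit errors. It is not the indexing structure described above and does not achieve its bounds; it only illustrates what "occurs with up to $k$ errors" means. The function name `approx_occurrences` and its parameters are chosen here for illustration.

```python
def approx_occurrences(text, pattern, k):
    """Return the end positions j (1-based, 1 <= j <= len(text)) such that
    some substring of `text` ending at j is within edit distance k of
    `pattern`. Illustrative online baseline only, O(len(text) * len(pattern))
    time; it is NOT the index-based method of the abstract."""
    m = len(pattern)
    # col[i] = minimum edit distance between pattern[:i] and any substring
    # of the text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # value of col[i-1] from the previous text position
        col[0] = 0          # a match may start at any text position
        for i in range(1, m + 1):
            old = col[i]
            cost = 0 if pattern[i - 1] == c else 1
            col[i] = min(prev_diag + cost,  # match or substitution
                         old + 1,           # extra character in the text
                         col[i - 1] + 1)    # pattern character left unmatched
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    # "bcd" ends at position 4 and differs from "bxd" by one substitution.
    print(approx_occurrences("abcdefg", "bxd", 1))  # [4]
```

A scan like this touches every text character on each query, whereas the index described above is built once so that query time depends on $m$ and $k$ rather than on the text length.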