Indexing with gaps

Authors:
Moshe Lewenstein
Affiliations:
Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
Venue:
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Year:
2011

Abstract

In Indexing with Gaps one seeks to index a text so as to support pattern queries that contain gaps. Formally, a gapped pattern over an alphabet $\Sigma$ is a pattern of the form $p = p_1 g_1 p_2 g_2 \cdots g_l p_{l+1}$, where $p_i \in \Sigma^*$ for all $i$ and each $g_i \in \mathbb{N}$ is a gap length. Often one considers these patterns under bound constraints, for example, all gaps are bounded by a gap bound $G$. Near-optimal solutions have recently been proposed for the case of a single gap of predetermined size, that is, indexing solutions for patterns of the form $p_1 \cdot g \cdot p_2$, where $g$ is known a priori. In this case the indexes are built in $O(n \log^{\epsilon} n)$ time and $O(n)$ space, and pattern queries are answered in $O(|p_1| + |p_2|)$ time, for constant-sized alphabets. For the more general case in which there is only a bound $G$ on the gap, these results can easily be adapted at the cost of a multiplicative factor of $O(G)$ in the preprocessing, i.e., $O(nG \log^{\epsilon} n)$ preprocessing time and $O(nG)$ preprocessing space. Alas, these solutions do not extend to more than one gap. In this paper we propose a solution for $k$ gaps with preprocessing time $O(n G^{2k} \log^{k} n \log\log n)$, space $O(n G^{2k} \log^{k} n)$, and query time $O(m + 2^{k} \log\log n)$, where $m = \sum_{i=1}^{k+1} |p_i|$.
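
To make the query semantics concrete, the following is a minimal brute-force sketch of matching a gapped pattern against a text. It is not the index proposed in the paper and achieves none of the bounds stated above; it only illustrates what an occurrence of $p_1 g_1 p_2 \cdots g_l p_{l+1}$ is. The function name `gapped_occurrences` and its parameters are hypothetical, and the sketch assumes that a gap $g_i$ stands for exactly $g_i$ arbitrary characters between $p_i$ and $p_{i+1}$.

```python
from typing import List, Sequence


def gapped_occurrences(text: str, pieces: Sequence[str], gaps: Sequence[int]) -> List[int]:
    """Return all start positions where the gapped pattern occurs in text.

    pieces = [p_1, ..., p_{l+1}] and gaps = [g_1, ..., g_l].
    Assumption: gap g_i means exactly g_i arbitrary characters
    between p_i and p_{i+1}.
    """
    if len(pieces) != len(gaps) + 1:
        raise ValueError("need exactly one more piece than gaps")
    if any(g < 0 for g in gaps):
        raise ValueError("gap lengths must be non-negative")

    # Offset of each piece relative to the start of an occurrence.
    offsets = []
    pos = 0
    for i, piece in enumerate(pieces):
        offsets.append(pos)
        pos += len(piece)
        if i < len(gaps):
            pos += gaps[i]
    total_len = pos  # total span of one occurrence, gaps included

    hits = []
    # Try every alignment; each piece must match at its fixed offset.
    for start in range(len(text) - total_len + 1):
        if all(text.startswith(piece, start + off)
               for piece, off in zip(pieces, offsets)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    # "a", one arbitrary character, then "a" in "abracadabra"
    print(gapped_occurrences("abracadabra", ["a", "a"], [1]))  # [3, 5]
```

This scan inspects every alignment of the text at query time, which is exactly the per-query cost that an index with query time depending only on $m$ and $k$ (up to the $\log\log n$ factor) is meant to avoid.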