A pair in a string is two occurrences of the same substring. A pair is maximal if the two occurrences cannot be extended to the left or to the right without making them differ. The gap of a pair is the number of characters between the two occurrences. In this paper we present methods for finding all maximal pairs under various constraints on the gap. In a string of length n, we can find all maximal pairs whose gap lies in an interval bounded from above and below in time O(n log n + z), where z is the number of reported pairs. If the upper bound is removed, the time reduces to O(n + z). Since a tandem repeat is a pair whose gap is zero, our methods can be seen as a generalization of finding tandem repeats. The running time of our methods equals that of well-known methods for finding tandem repeats.
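To make the definitions of maximal pair and gap concrete, the following is a minimal brute-force sketch in Python. It is not the method of the paper and does not achieve the stated bounds; it only enumerates pairs directly from the definitions. The function name `maximal_pairs`, the parameters `min_gap` and `max_gap`, the 0-based `(i, j, length)` output format, and the choice to treat overlapping occurrences as having a negative gap are all assumptions made for illustration.

```python
def maximal_pairs(s, min_gap=0, max_gap=None):
    """Enumerate maximal pairs (i, j, length) with i < j by brute force.

    Illustrative only: this checks every pair of start positions, so it
    is far slower than the O(n log n + z) and O(n + z) bounds in the text.
    """
    n = len(s)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            # Left-maximal: the characters preceding the two occurrences
            # must differ (or the first occurrence starts the string).
            if i > 0 and s[i - 1] == s[j - 1]:
                continue
            # Right-maximal: extend while both occurrences agree.
            length = 0
            while j + length < n and s[i + length] == s[j + length]:
                length += 1
            if length == 0:
                continue
            # Gap: number of characters between the two occurrences.
            # Assumption: overlapping occurrences give a negative gap.
            gap = j - (i + length)
            if gap < min_gap:
                continue
            if max_gap is not None and gap > max_gap:
                continue
            pairs.append((i, j, length))
    return pairs
```

For example, under these assumptions `maximal_pairs("abab", 0, 0)` reports the single tandem repeat `(0, 2, 2)`, while `maximal_pairs("abcab", 1, 1)` reports `(0, 3, 2)`, the two occurrences of `ab` separated by one character. Passing `max_gap=None` corresponds to the case in which the upper bound on the gap is removed.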