Algorithms on strings, trees, and sequences: computer science and computational biology
Algorithms on strings, trees, and sequences: computer science and computational biology
Information Processing Letters
High-order entropy-compressed text indexes
SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
Opportunistic data structures with applications
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Replacing suffix trees with enhanced suffix arrays
Journal of Discrete Algorithms - SPIRE 2002
Computing all repeats using suffix arrays
Journal of Automata, Languages and Combinatorics - Special issue: Selected papers of the 13th Australasian workshop on combinatorial algorithms
A taxonomy of suffix array construction algorithms
ACM Computing Surveys (CSUR)
On-line construction of compact suffix vectors and maximal repeats
Theoretical Computer Science
A Linear-Time Burrows-Wheeler Transform Using Induced Sorting
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Top-k ranked document search in general text databases
ESA'10 Proceedings of the 18th annual European conference on Algorithms: Part II
Computing the longest common prefix array based on the burrows-wheeler transform
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Efficient Maximal Repeat Finding Using the Burrows-Wheeler Transform and Wavelet Tree
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Efficient computation of substring equivalence classes with suffix arrays
CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching
RACE: a scalable and elastic parallel system for discovering repeats in very long sequences
Proceedings of the VLDB Endowment
Hi-index | 0.00 |
The identification of repetitive sequences (repeats) is an essential component of genome sequence analysis, and the notions of maximal and supermaximal repeats capture all exact repeats in a genome in a compact way. Very recently, Külekci et al. (Computational Biology and Bioinformatics, 2012) developed an algorithm for finding all maximal repeats that is very space-efficient because it uses the Burrows-Wheeler transform and wavelet trees. In this paper, we present a new space-efficient algorithm for finding maximal repeats in massive data that outperforms their algorithm both in theory and practice. The algorithm is not confined to this task, it can also be used to find all supermaximal repeats or to solve other problems space-efficiently.