The suffix tree of a string is the fundamental data structure of combinatorial pattern matching. We present a recursive technique for building suffix trees that yields optimal algorithms in different computational models. Sorting is an inherent bottleneck in building suffix trees, and our algorithms match the sorting lower bound. Specifically, we present the following results.

(1) Weiner [1973], who introduced the data structure, gave an optimal O(n)-time algorithm for building the suffix tree of an n-character string drawn from a constant-size alphabet. In the comparison model, there is a trivial Ω(n log n)-time lower bound based on sorting, and Weiner's algorithm matches this bound. For integer alphabets, the fastest known algorithm is the O(n log n)-time comparison-based algorithm, but no super-linear lower bound is known. Closing this gap is the main open question in stringology. We settle this open problem by giving a linear-time reduction to sorting for building suffix trees. Since sorting is a lower bound for building suffix trees, this algorithm is time-optimal in every alphabet model. In particular, for an alphabet consisting of integers in a polynomial range, we get the first known linear-time algorithm.

(2) All previously known algorithms for building suffix trees exhibit a marked absence of locality of reference, and thus they tend to elicit many page faults (I/Os) when indexing very long strings. They are therefore unsuitable for building suffix trees in secondary storage devices, where I/Os dominate the overall computational cost. We give a linear-I/O reduction to sorting for suffix tree construction. Since sorting is a trivial I/O lower bound for building suffix trees, our algorithm is I/O-optimal.
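
The remark that sorting is a lower bound for suffix tree construction can be illustrated with a small sketch. It is not the recursive algorithm of the paper. It sorts suffixes naively and only shows the direction "suffix order yields sorted symbols". The function names `suffix_array` and `sort_via_suffixes` are hypothetical, and a suffix array stands in for the suffix tree on the assumption that the tree's leaves, read left to right with children in symbol order, give the same suffix order.

```python
def suffix_array(s):
    """Return the start positions of the suffixes of s in lexicographic order.

    Naive illustration only: comparing whole suffixes costs far more than
    the linear-time reduction to sorting described in the abstract.
    """
    return sorted(range(len(s)), key=lambda i: s[i:])


def sort_via_suffixes(values):
    """Sort a list of symbols by reading off the first symbol of each suffix.

    Lexicographic order compares first symbols first, so the first symbols
    of the sorted suffixes appear in nondecreasing order. Any procedure that
    produces the suffix order has therefore also sorted the input symbols.
    """
    order = suffix_array(values)
    return [values[i] for i in order]


if __name__ == "__main__":
    print(suffix_array("banana"))
    print(sort_via_suffixes([5, 3, 9, 1]))
```

Because the suffix order already determines the sorted order of the symbols, no suffix tree construction can be asymptotically faster than sorting the alphabet symbols that occur in the string, which is the sense in which a linear-time reduction to sorting is optimal.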