Data compression with finite windows
Communications of the ACM
Self-alignments in words and their applications
Journal of Algorithms
Algorithms on strings, trees, and sequences: computer science and computational biology
A Space-Economical Suffix Tree Construction Algorithm
Journal of the ACM (JACM)
Reducing the space requirement of suffix trees
Software—Practice & Experience
Database indexing for large DNA and protein sequence collections
The VLDB Journal — The International Journal on Very Large Data Bases
Linear Bidirectional On-Line Construction of Affix Trees
CPM '00 Proceedings of the 11th Annual Symposium on Combinatorial Pattern Matching
Extended application of suffix trees to data compression
DCC '96 Proceedings of the Conference on Data Compression
Optimal suffix tree construction with large alphabets
FOCS '97 Proceedings of the 38th Annual Symposium on Foundations of Computer Science
Compact directed acyclic word graphs for a sliding window
Journal of Discrete Algorithms - SPIRE 2002
Constructing Suffix Tree for Gigabyte Sequences with Megabyte Memory
IEEE Transactions on Knowledge and Data Engineering
Practical methods for constructing suffix trees
The VLDB Journal — The International Journal on Very Large Data Bases
DCC '06 Proceedings of the Data Compression Conference
A taxonomy of suffix array construction algorithms
ACM Computing Surveys (CSUR)
Genome-scale disk-based suffix tree indexing
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
SPIRE '08 Proceedings of the 15th International Symposium on String Processing and Information Retrieval
Dynamic extended suffix arrays
Journal of Discrete Algorithms
The classical suffix tree construction algorithms of McCreight and Ukkonen spend most of their time looking up the right branch to follow from the current node. However, not all of these slow branching operations are necessary. A significant portion of them serves only to simulate implicit suffix links and can be avoided by replacing the traditional top-down descent with bottom-up climbing. We describe the bottom-up approach and analyze its costs and benefits. An experimental evaluation on two standard data corpora shows that bottom-up climbing removes 40 to 66 percent of branching operations and consequently saves 21 to 32 percent of construction time. However, a theoretical analysis of the worst-case behavior reveals that the time complexity of the bottom-up approach is superlinear. This is remedied by combining both approaches: the combination removes nearly as many branching operations as the bottom-up climb, but still runs in linear time like the top-down descent.
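
To make the notion of a branching operation concrete, the following minimal sketch counts child lookups while building a naive, uncompressed suffix trie. It is not McCreight's or Ukkonen's algorithm and not the bottom-up approach described above; the names build_suffix_trie and contains are hypothetical and chosen only for illustration. Each lookup of "which child do I follow for this character" is counted as one branching operation.

```python
def build_suffix_trie(text):
    """Build a naive (quadratic) suffix trie of `text`.

    Illustration only: nodes are plain dicts mapping a character to a child.
    Returns the root and the number of child lookups (branching operations).
    """
    root = {}
    branching_ops = 0
    for start in range(len(text)):
        node = root
        # Top-down descent from the root for every suffix.
        for ch in text[start:]:
            branching_ops += 1  # one child lookup = one branching operation
            child = node.get(ch)
            if child is None:
                child = {}
                node[ch] = child
            node = child
    return root, branching_ops


def contains(root, pattern):
    """Return True if `pattern` occurs as a substring of the indexed text."""
    node = root
    for ch in pattern:
        node = node.get(ch)
        if node is None:
            return False
    return True


if __name__ == "__main__":
    root, ops = build_suffix_trie("banana$")
    print("branching operations:", ops)
    print("contains 'nan':", contains(root, "nan"))
```

In this naive construction every character of every suffix costs one lookup, so the count grows quadratically with the text length. Linear-time algorithms avoid most of this work, and the abstract above concerns reducing the lookups that still remain in them.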