With the first human DNA decoded into a sequence of about 2.8 billion base pairs, much biological research has centered on analyzing this sequence. In theory, it is now feasible to hold an index for human DNA in main memory so that any pattern can be located efficiently. This is due to the recent breakthrough in compressed suffix arrays, which reduces the space requirement from O(n log n) bits to O(n) bits. However, constructing a compressed suffix array is still not easy, because the suffix array must be computed first, which needs O(n log n) bits of working memory (i.e., more than 13 gigabytes for human DNA). This paper initiates the study of constructing compressed suffix arrays directly from the text. The main contribution is a new construction algorithm that uses only O(n) bits of working memory while, more importantly, keeping the time complexity the same as before, i.e., O(n log n).
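For readers unfamiliar with the underlying index, the following is a minimal sketch of a plain (uncompressed) suffix array and of pattern location by binary search. It is not the construction algorithm of this paper: the array is built by naive sorting, which is far slower and more memory-hungry than the methods discussed, and the function names `build_suffix_array` and `find_occurrences` are hypothetical, chosen only for illustration.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive illustration only: sorting materialized suffixes is not the
    space-efficient construction the abstract describes.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of pattern in text, in increasing order."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[start:lo] begins with pattern.
    return sorted(sa[start:lo])


if __name__ == "__main__":
    dna = "acgtacgta"
    sa = build_suffix_array(dna)
    print(find_occurrences(dna, sa, "cgt"))
```

Each array entry is a text position, so it takes about log n bits; this is where the O(n log n)-bit cost of the uncompressed index comes from.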