Suffix trees and suffix arrays are the most prominent full-text indices, and their construction algorithms are well studied. In the literature, the fastest algorithm runs in $O(n)$ time but requires $O(n\log n)$-bit working space, where $n$ denotes the length of the text. On the other hand, the most space-efficient algorithm requires $O(n)$-bit working space but runs in $O(n\log n)$ time. It was an open problem whether these indices can be constructed in both $o(n\log n)$ time and $o(n\log n)$-bit working space. This paper breaks this time-and-space barrier under the unit-cost word RAM model. We give an algorithm for constructing the suffix array that takes $O(n)$ time and $O(n)$-bit working space for texts with constant-size alphabets. Note that both the time and the space bounds are optimal. For constructing the suffix tree, our algorithm requires $O(n\log^{\epsilon}n)$ time and $O(n)$-bit working space for any $0