Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
An Efficient Method for in Memory Construction of Suffix Arrays
SPIRE '99 Proceedings of the String Processing and Information Retrieval Symposium & International Workshop on Groupware
A Fast Algorithms for Making Suffix Arrays and for Burrows-Wheeler Transformation
DCC '98 Proceedings of the Conference on Data Compression
Breaking a Time-and-Space Barrier in Constructing Full-Text Indices
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
Linear work suffix array construction
Journal of the ACM (JACM)
A taxonomy of suffix array construction algorithms
ACM Computing Surveys (CSUR)
Better external memory suffix array construction
Journal of Experimental Algorithmics (JEA)
A Linear-Time Burrows-Wheeler Transform Using Induced Sorting
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Fast lightweight suffix array construction and checking
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
WADS'11 Proceedings of the 12th international conference on Algorithms and data structures
Two Efficient Algorithms for Linear Time Suffix Array Construction
IEEE Transactions on Computers
Lightweight Data Indexing and Compression in External Memory
Algorithmica - Special Issue: Theoretical Informatics
ICALP'07 Proceedings of the 34th international conference on Automata, Languages and Programming
Hi-index | 0.00 |
This article presents an O(n)-time algorithm called SACA-K for sorting the suffixes of an input string T[0, n-1] over an alphabet A[0, K-1]. The problem of sorting the suffixes of T is also known as constructing the suffix array (SA) for T. The theoretical memory usage of SACA-K is n log K + n log n + K log n bits. Moreover, we also have a practical implementation for SACA-K that uses n bytes + (n + 256) words and is suitable for strings over any alphabet up to full ASCII, where a word is log n bits. In our experiment, SACA-K outperforms SA-IS that was previously the most time- and space-efficient linear-time SA construction algorithm (SACA). SACA-K is around 33% faster and uses a smaller deterministic workspace of K words, where the workspace is the space needed beyond the input string and the output SA. Given K=O(1), SACA-K runs in linear time and O(1) workspace. To the best of our knowledge, such a result is the first reported in the literature with a practical source code publicly available.