Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
Fast algorithms for sorting and searching strings
SODA '97 Proceedings of the eighth annual ACM-SIAM symposium on Discrete algorithms
PATRICIA—Practical Algorithm To Retrieve Information Coded in Alphanumeric
Journal of the ACM (JACM)
COCOON '96 Proceedings of the Second Annual International Conference on Computing and Combinatorics
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
CPM '96 Proceedings of the 7th Annual Symposium on Combinatorial Pattern Matching
STOC '83 Proceedings of the fifteenth annual ACM symposium on Theory of computing
Efficient randomized pattern-matching algorithms
IBM Journal of Research and Development - Mathematics and computing
Linear pattern matching algorithms
SWAT '73 Proceedings of the 14th Annual Symposium on Switching and Automata Theory (swat 1973)
Sorting networks and their applications
AFIPS '68 (Spring) Proceedings of the April 30--May 2, 1968, spring joint computer conference
Fast lightweight suffix array construction and checking
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Sparse and truncated suffix trees on variable-length codes
CPM'11 Proceedings of the 22nd annual conference on Combinatorial pattern matching
Pattern Matching on Sparse Suffix Trees
CCP '11 Proceedings of the 2011 First International Conference on Data Compression, Communications and Processing
On-Line linear-time construction of word suffix trees
CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching
CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching
Hi-index | 0.00 |
We consider the problem of constructing a sparse suffix tree (or suffix array) for b suffixes of a given text T of length n, using only O(b) words of space during construction. Attempts at breaking the naive bound of Ω(nb) time for this problem can be traced back to the origins of string indexing in 1968. First results were only obtained in 1996, but only for the case where the suffixes were evenly spaced in T. In this paper there is no constraint on the locations of the suffixes. We show that the sparse suffix tree can be constructed in O(nlog2b) time. To achieve this we develop a technique, which may be of independent interest, that allows to efficiently answer b longest common prefix queries on suffixes of T, using only O(b) space. We expect that this technique will prove useful in many other applications in which space usage is a concern. Our first solution is Monte-Carlo and outputs the correct tree with high probability. We then give a Las-Vegas algorithm which also uses O(b) space and runs in the same time bounds with high probability when $b = O(\sqrt{n})$. Furthermore, additional tradeoffs between the space usage and the construction time for the Monte-Carlo algorithm are given.