Sparse suffix tree construction in small space

Authors:
Philip Bille;Johannes Fischer;Inge Li Gørtz;Tsvi Kopelowitz;Benjamin Sach;Hjalte Wedel Vildhøj
Affiliations:
Technical University of Denmark, Denmark;Institute of Theoretical Informatics, KIT, Germany;Technical University of Denmark, Denmark;Weizmann Institute of Science, Israel;University of Warwick, UK;Technical University of Denmark, Denmark
Venue:
ICALP'13 Proceedings of the 40th international conference on Automata, Languages, and Programming - Volume Part I
Year:
2013


Abstract

We consider the problem of constructing a sparse suffix tree (or suffix array) for b suffixes of a given text T of length n, using only O(b) words of space during construction. Attempts at breaking the naive bound of Ω(nb) time for this problem can be traced back to the origins of string indexing in 1968. The first results were obtained only in 1996, and only for the case where the suffixes are evenly spaced in T. In this paper there is no constraint on the locations of the suffixes. We show that the sparse suffix tree can be constructed in $O(n \log^2 b)$ time. To achieve this we develop a technique, which may be of independent interest, that efficiently answers b longest common prefix queries on suffixes of T using only O(b) space. We expect that this technique will prove useful in many other applications in which space usage is a concern. Our first solution is Monte Carlo and outputs the correct tree with high probability. We then give a Las Vegas algorithm which also uses O(b) space and runs in the same time bounds with high probability when $b = O(\sqrt{n})$. Furthermore, we give additional tradeoffs between the space usage and the construction time for the Monte Carlo algorithm.
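
To make the problem statement concrete, the following is a minimal sketch of a naive baseline, not the algorithm of the paper. It sorts the b chosen suffixes by direct character comparison and then computes the longest common prefix of neighbouring suffixes. It keeps only O(b) words besides the text, but a single comparison may scan up to n characters, which is the kind of cost the paper sets out to avoid. The function names `lcp`, `sparse_suffix_array` and `sparse_lcp_array` are hypothetical and chosen for illustration only.

```python
from functools import cmp_to_key


def lcp(text, i, j):
    """Length of the longest common prefix of text[i:] and text[j:]."""
    n = len(text)
    k = 0
    while i + k < n and j + k < n and text[i + k] == text[j + k]:
        k += 1
    return k


def sparse_suffix_array(text, positions):
    """Return the given suffix start positions in lexicographic suffix order.

    Naive baseline (an assumption for illustration): every comparison
    scans the text directly, so no O(n)-size index is ever built.
    """
    n = len(text)

    def compare(i, j):
        if i == j:
            return 0
        k = lcp(text, i, j)
        # If one suffix is exhausted, it is a proper prefix of the other
        # and therefore sorts first.
        if i + k == n:
            return -1
        if j + k == n:
            return 1
        return -1 if text[i + k] < text[j + k] else 1

    return sorted(positions, key=cmp_to_key(compare))


def sparse_lcp_array(text, ssa):
    """LCP of each pair of neighbouring suffixes in the sparse suffix array."""
    return [lcp(text, ssa[r - 1], ssa[r]) for r in range(1, len(ssa))]


if __name__ == "__main__":
    T = "banana"
    ssa = sparse_suffix_array(T, [0, 2, 4])
    print(ssa, sparse_lcp_array(T, ssa))
```

For the text `banana` and positions 0, 2 and 4, the chosen suffixes are `banana`, `nana` and `na`. The sparse suffix array together with the neighbouring LCP values is the information from which a sparse suffix tree can be assembled.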