Sparse and truncated suffix trees on variable-length codes

Authors:
Takashi Uemura; Hiroki Arimura
Affiliations:
Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan; Graduate School of Information Science and Technology, Hokkaido University, Sapporo, Japan
Venue:
CPM'11 Proceedings of the 22nd annual conference on Combinatorial pattern matching
Year:
2011

References (11)

Algorithms on strings, trees, and sequences: computer science and computational biology
A Space-Economical Suffix Tree Construction Algorithm. Journal of the ACM (JACM)
Processing Text Files as Is: Pattern Matching over Compressed Texts, Multi-byte Character Texts, and Semi-structured Texts. SPIRE 2002 Proceedings of the 9th International Symposium on String Processing and Information Retrieval
Sparse Suffix Trees. COCOON '96 Proceedings of the Second Annual International Conference on Computing and Combinatorics
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications. CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
Truncated suffix trees and their application to data compression. Theoretical Computer Science
Replacing suffix trees with enhanced suffix arrays. Journal of Discrete Algorithms - SPIRE 2002
On-Line linear-time construction of word suffix trees. CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching
Property matching and weighted matching. CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching
Sparse directed acyclic word graphs. SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval
Suffix arrays on words. CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching

Cited by (1)

Sparse suffix tree construction in small space. ICALP'13 Proceedings of the 40th international conference on Automata, Languages, and Programming - Volume Part I

Abstract

The sparse suffix tree (SST), introduced by Kärkkäinen and Ukkonen (COCOON 1996), is the suffix tree for a subset of the suffixes of an input text T of length n. In this paper, we study the special case in which the input string is a sequence of k codewords drawn from a regular prefix code Δ ⊆ Σ⁺ recognized by a finite automaton, and the index points lie on the codeword boundaries. For this case, we present an online algorithm that constructs the sparse suffix tree for an input string T over any variable-length regular prefix code, called the code suffix tree (CST), in O(n + m) time and O(k) additional space for a fixed base alphabet Σ, where m is the size of an automaton for Δ. Furthermore, we present a modified algorithm for the l-truncated version of the code suffix tree that runs within the same time and space bounds. These results generalize the previous results for word suffix trees (Inenaga and Takeda, CPM 2006) and for truncated suffix trees (Na, Apostolico, Iliopoulos, and Park, Theor. Comp. Sci., 304, 2003) to arbitrary variable-length regular prefix codes, such as Huffman codes and multi-byte codes (e.g., UTF-8).
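
To make the notion of index points on codeword boundaries concrete, the following is a minimal Python sketch. It is not the paper's algorithm: it builds an uncompacted trie naively in O(nk) time and space rather than online in O(n + m) time, and it takes the prefix code as an explicit set of codewords instead of a finite automaton. The function names and the toy code {0, 10, 11} are assumptions chosen for illustration.

```python
from typing import List, Set


def codeword_boundaries(text: str, code: Set[str]) -> List[int]:
    """Parse `text` as a sequence of codewords and return their start positions.

    Because `code` is assumed to be prefix-free, at most one codeword can
    start at any position, so the parse is unique.
    """
    starts, i = [], 0
    max_len = max(len(w) for w in code)
    while i < len(text):
        for j in range(i + 1, min(len(text), i + max_len) + 1):
            if text[i:j] in code:
                starts.append(i)
                i = j
                break
        else:
            raise ValueError(f"no codeword starts at position {i}")
    return starts


def build_sparse_suffix_trie(text: str, starts: List[int]) -> dict:
    """Insert only the suffixes beginning at `starts` into a plain trie.

    Naive illustration: every character is a separate node, so this is
    far from the time and space bounds stated in the abstract.
    """
    root: dict = {}
    for s in starts:
        node = root
        for ch in text[s:]:
            node = node.setdefault(ch, {})
        node[None] = s  # the key None marks the end of the suffix starting at s
    return root


def occurrences(root: dict, pattern: str) -> List[int]:
    """Return the codeword-aligned positions at which `pattern` occurs."""
    node = root
    for ch in pattern:
        if ch not in node:
            return []
        node = node[ch]
    found, stack = [], [node]
    while stack:  # collect every suffix start stored below the matched node
        cur = stack.pop()
        for key, child in cur.items():
            if key is None:
                found.append(child)
            else:
                stack.append(child)
    return sorted(found)


code = {"0", "10", "11"}   # toy prefix code over the base alphabet {0, 1}
text = "01011100"          # parses as 0 | 10 | 11 | 10 | 0
starts = codeword_boundaries(text, code)
trie = build_sparse_suffix_trie(text, starts)
print(starts)                   # [0, 1, 3, 5, 7]
print(occurrences(trie, "11"))  # [3]
```

In this example the substring 11 also appears at position 4 of the raw text, but that occurrence straddles a codeword boundary and is therefore not indexed. Filtering out such misaligned matches is the point of restricting the index to code boundaries.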