On-Line Linear-Time Construction of Word Suffix Trees

Authors:
Shunsuke Inenaga;Masayuki Takeda
Affiliations:
Japan Society for the Promotion of Science;Department of Informatics, Kyushu University, Fukuoka, Japan
Venue:
CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching
Year:
2006



Abstract

Suffix trees are the key data structure for string matching and are used in a wide range of application areas, such as bioinformatics and data compression. Sparse suffix trees are a kind of suffix tree that represents only a subset of the suffixes of the input string. In this paper we study word suffix trees, which are one variation of sparse suffix trees. Let D be a dictionary of words and let w be a string in D+, that is, a sequence w_1 ⋯ w_k of k words in D. The word suffix tree of w with respect to D is a path-compressed trie that represents only the k suffixes of the form w_i ⋯ w_k. A typical application is word- and phrase-level search on natural language documents. Andersson et al. proposed an algorithm that builds word suffix trees in O(n) expected time with O(k) space, where n is the length of w. In this paper we present a new word suffix tree construction algorithm that runs in O(n) time and O(k) space in the worst case. Our algorithm is on-line, meaning that it processes the characters of the input sequentially, one by one, from left to right.
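
To make the definition concrete, the following Python sketch indexes only the k word suffixes of a text and answers word-aligned phrase queries. It is not the algorithm of the paper: it is a naive, uncompressed trie built in quadratic time and space, intended only to show which suffixes a word suffix tree represents. The names `Node`, `word_suffix_starts`, `build_naive_word_suffix_trie`, and `find_word_aligned` are hypothetical, and the sketch assumes that every word ends with a delimiter character `#` that does not occur inside words.

```python
class Node:
    """Trie node; `start` is set when a word suffix ends exactly here."""

    def __init__(self):
        self.children = {}
        self.start = None


def word_suffix_starts(text, delim="#"):
    """Return the start index of every word suffix w_i ... w_k of text."""
    starts = [0]
    # A new word begins right after each delimiter except the final one.
    for i, ch in enumerate(text[:-1]):
        if ch == delim:
            starts.append(i + 1)
    return starts


def build_naive_word_suffix_trie(text, delim="#"):
    """Insert the k word suffixes character by character (assumption:
    text is non-empty and ends with delim). Naive and uncompressed."""
    if not text.endswith(delim):
        raise ValueError("text must end with the word delimiter")
    root = Node()
    for start in word_suffix_starts(text, delim):
        node = root
        for ch in text[start:]:
            node = node.children.setdefault(ch, Node())
        node.start = start
    return root


def find_word_aligned(root, pattern):
    """Return sorted start positions of word suffixes that begin with pattern."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    # Every word suffix ending at or below this node has pattern as a prefix.
    found, stack = [], [node]
    while stack:
        cur = stack.pop()
        if cur.start is not None:
            found.append(cur.start)
        stack.extend(cur.children.values())
    return sorted(found)


text = "to#be#or#not#to#be#"
trie = build_naive_word_suffix_trie(text)
print(find_word_aligned(trie, "to#be#"))  # phrase query at word boundaries
print(find_word_aligned(trie, "o#"))      # occurs in the text, but never at a word start
```

Because only suffixes that start at word boundaries are inserted, the query `o#` finds nothing even though that substring occurs inside `to#`. A path-compressed version of the same structure would have O(k) nodes, which is the space bound stated in the abstract.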