The Weighted Suffix Tree: An Efficient Data Structure for Handling Molecular Weighted Sequences and its Applications

Authors:
Costas S. Iliopoulos;Christos Makris;Yannis Panagis;Katerina Perdikuri;Evangelos Theodoridis;Athanasios Tsakalidis
Affiliations:
Department of Computer Science, King's College London, Strand, London WC2R 2LS, England. E-mail: csi@dcs.kcl.ac.uk (Iliopoulos);Department of Computer Engineering and Informatics, University of Patras, 26504 Patras, Greece. E-mail: {makri,panagis,perdikur,theodori}@ceid.upatras.gr (Makris, Panagis, Perdikuri, Theodoridis);Research Academic Computer Technology Institute, N. Kazantzaki Str., Rio 26504 Patras, Greece. E-mail: tsak@cti.gr (Tsakalidis)
Venue:
Fundamenta Informaticae
Year:
2006

Abstract

In this paper we introduce the Weighted Suffix Tree, an efficient data structure for computing string regularities in weighted sequences of molecular data. Molecular weighted sequences can model important biological processes such as the DNA assembly process or the DNA-protein binding process. Pattern matching and the identification of repeated patterns in biological weighted sequences are therefore very important procedures in the translation of gene expression and regulation. We present time- and space-efficient algorithms for constructing the weighted suffix tree, together with applications of the proposed data structure to problems from molecular biology: pattern matching, repeat discovery, discovery of the longest common subsequence of two weighted sequences, and computation of covers.
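The abstract does not define a weighted sequence, so the following is only an illustrative sketch under stated assumptions, not the paper's algorithm. It assumes the usual convention in which each position holds a probability for every symbol that may occur there, and a pattern is said to occur at a position when the product of the per-position probabilities meets a chosen threshold. The names `occurrence_probability`, `find_occurrences`, and `threshold` are hypothetical. The scan is a naive baseline that checks every start position, which is the kind of work an index such as the weighted suffix tree is meant to avoid.

```python
from typing import Dict, List

# Assumed representation: one dict per position, mapping symbol -> probability.
WeightedSequence = List[Dict[str, float]]


def occurrence_probability(seq: WeightedSequence, pattern: str, start: int) -> float:
    """Product of the per-position probabilities of `pattern` placed at `start`."""
    prob = 1.0
    for offset, symbol in enumerate(pattern):
        # A symbol that is absent at a position contributes probability 0.
        prob *= seq[start + offset].get(symbol, 0.0)
    return prob


def find_occurrences(seq: WeightedSequence, pattern: str, threshold: float) -> List[int]:
    """Naive scan: start positions where the pattern's probability >= threshold."""
    hits = []
    for start in range(len(seq) - len(pattern) + 1):
        if occurrence_probability(seq, pattern, start) >= threshold:
            hits.append(start)
    return hits


# Toy weighted DNA sequence (made-up probabilities, for illustration only).
seq: WeightedSequence = [
    {"A": 1.0},
    {"A": 0.5, "C": 0.5},
    {"G": 1.0},
    {"A": 0.25, "T": 0.75},
    {"A": 1.0},
    {"C": 1.0},
    {"G": 1.0},
]

print(find_occurrences(seq, "ACG", 0.25))  # start positions with probability >= 0.25
```

With this toy input, the pattern `ACG` has probability 0.5 at position 0 and 1.0 at position 4, so raising the threshold from 0.25 to 0.75 drops the first occurrence and keeps the second.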