A parallel algorithm for the extraction of structured motifs

Authors:
Alexandra M. Carvalho;Arlindo L. Oliveira;Ana T. Freitas;Marie-France Sagot
Affiliations:
INESC-ID, Rua Alves Redol, Lisboa, Portugal;IST/INESC-ID, Rua Alves Redol, Lisboa, Portugal;IST/INESC-ID, Rua Alves Redol, Lisboa, Portugal;Inria Rhône-Alpes, Université Claude Bernard, Villeurbanne Cedex, France
Venue:
Proceedings of the 2004 ACM symposium on Applied computing
Year:
2004

Abstract

In this work we propose a parallel algorithm for the efficient extraction of binding-site consensus from genomic sequences. The algorithm, based on an existing approach, extracts structured motifs, which consist of an ordered collection of p ≥ 1 boxes whose sizes and inter-box spacings are specified by given parameters. The contents of the boxes, which represent the extracted motifs, are unknown at the start of the process and are found by the algorithm using a suffix tree as the fundamental data structure. By partitioning the search space of structured motifs, we divide the most demanding part of the algorithm among a number of processors that can be loosely coupled. In this way we obtain, under conditions that are easily met, a speedup that is linear in the number of available processing units. This speedup is confirmed by both theoretical and experimental analysis, which are also presented in this paper.
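
To make the idea of partitioning the search space concrete, here is a minimal sketch. It is not the authors' algorithm: it uses brute-force enumeration with exact matching instead of a suffix tree, fixes p = 2 boxes, and splits the work by the leading symbols of the first box. All names (`extract_partition`, `extract_parallel`, `quorum`, `prefix_len`) are hypothetical, and the quorum parameter (the minimum number of sequences a motif must occur in) is an assumption not stated in the abstract.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from itertools import product

ALPHABET = "ACGT"  # assumption: sequences contain only these four symbols


def extract_partition(seqs, prefix, k1, k2, d_min, d_max, quorum):
    """Return two-box motifs whose first box starts with `prefix`.

    A motif is a pair (box1, box2) with len(box1) == k1, len(box2) == k2,
    separated by a gap of d_min..d_max symbols, and occurring (exactly)
    in at least `quorum` distinct sequences.
    """
    support = defaultdict(set)  # motif -> indices of sequences containing it
    for idx, s in enumerate(seqs):
        for i in range(len(s) - k1 + 1):
            box1 = s[i:i + k1]
            if not box1.startswith(prefix):
                continue  # this candidate belongs to another partition
            for d in range(d_min, d_max + 1):
                j = i + k1 + d  # start of the second box
                if j + k2 <= len(s):
                    support[(box1, s[j:j + k2])].add(idx)
    return {motif for motif, hits in support.items() if len(hits) >= quorum}


def extract_parallel(seqs, k1, k2, d_min, d_max, quorum, prefix_len=1, workers=4):
    """Split the search space by first-box prefix and merge the partial results."""
    prefixes = ["".join(p) for p in product(ALPHABET, repeat=prefix_len)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda p: extract_partition(seqs, p, k1, k2, d_min, d_max, quorum),
            prefixes,
        )
        # Partitions are disjoint, so the union is the complete answer.
        return set().union(*parts)
```

Each partition needs only the input sequences and its own prefix, so the workers never communicate until the final merge; this is the sense in which they can be loosely coupled. Threads are used here only to keep the sketch short. In CPython, an actual speedup for this CPU-bound work would require separate processes or machines.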