Efficient composite pattern finding from monad patterns

Authors:
Jianjun Zhou;Jorg Sander;Guohui Lin
Affiliations:
Bioinformatics Research Group, Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada.;Bioinformatics Research Group, Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada.;Bioinformatics Research Group, Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada
Venue:
International Journal of Bioinformatics Research and Applications
Year:
2007

Abstract

Automatically identifying frequent composite patterns in DNA sequences is an important task in bioinformatics, especially when all the basic elements (or monad patterns) of a composite pattern are weak. In this paper, we compare one straightforward approach for assembling monad patterns into composite patterns with two other, considerably more complex approaches. Both our theoretical analysis and our empirical results show that this overlooked straightforward method can be several orders of magnitude faster. Furthermore, contrary to previous understanding, the empirical results show that which of the three approaches runs fastest depends closely on how insignificant the monad patterns are.
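
To make the terminology concrete, the following is a minimal illustrative sketch of what assembling two monad patterns into a composite pattern could look like. It is not the method evaluated in the paper, whose details the abstract does not give. The function names, the substitution-only mismatch model, and the gap-range constraint are all assumptions chosen for illustration.

```python
def find_occurrences(seq, pattern, max_mismatches=0):
    """Return start positions where `pattern` matches `seq` with at most
    `max_mismatches` substitutions (assumed mismatch model)."""
    hits = []
    for i in range(len(seq) - len(pattern) + 1):
        window = seq[i:i + len(pattern)]
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def composite_support(sequences, left, right, min_gap, max_gap, max_mismatches=0):
    """Count the sequences in which monad `left` is followed by monad `right`
    with a gap of min_gap..max_gap characters (hypothetical composite definition)."""
    support = 0
    for seq in sequences:
        left_hits = find_occurrences(seq, left, max_mismatches)
        right_hits = set(find_occurrences(seq, right, max_mismatches))
        # A sequence supports the composite if any left hit has a right hit
        # starting at an allowed distance after the left hit ends.
        found = any(
            (start + len(left) + gap) in right_hits
            for start in left_hits
            for gap in range(min_gap, max_gap + 1)
        )
        if found:
            support += 1
    return support


# Toy data, invented for this example only.
sequences = ["ACGTTTTGCA", "ACGTAAGCA", "TTTTTTTT"]
support = composite_support(sequences, "ACGT", "GCA", min_gap=1, max_gap=3)
```

In this toy setting, a composite pattern would be called frequent when its support reaches a chosen threshold. Weak monad patterns correspond to allowing more mismatches, which increases the number of candidate occurrences that must be paired.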