Most algorithms for discovering motifs in DNA sequences require the motif's length as input. Styczynski et al. introduced the Extended (l,d)-Motif Problem (EMP), in which the motif's length is not an input parameter. Unfortunately, their algorithm takes an unacceptably long time to run, e.g. over 3 months to discover a length-14 motif. Since the best motif may be neither the longest nor the one with the largest number of binding sites, in this paper we eliminate a second input parameter, the minimum number of binding sites, in order to provide more realistic and robust results. We also develop an efficient algorithm that solves both EMP and this redefined problem.
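For orientation, the classical fixed-length (l,d)-motif problem that EMP extends can be illustrated with a brute-force search. The sketch below is not the algorithm of this paper or of Styczynski et al.; it assumes the standard formulation in which a motif of length l must occur in a sequence with at most d mismatches, and the names `hamming`, `occurs`, `ld_motifs`, and the quorum parameter `q` (the minimum number of sequences containing a binding site) are hypothetical choices made for illustration. It shows the two inputs, l and q, that the abstract describes removing.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(motif, seq, d):
    """True if some length-len(motif) window of seq is within d mismatches."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def ld_motifs(seqs, l, d, q=None):
    """Enumerate all 4**l candidates; keep those found in at least q sequences.

    q defaults to len(seqs), i.e. every sequence must contain a site
    (an assumption matching the classical problem statement).
    """
    if q is None:
        q = len(seqs)
    hits = []
    for cand in product("ACGT", repeat=l):
        motif = "".join(cand)
        support = sum(occurs(motif, s, d) for s in seqs)
        if support >= q:
            hits.append(motif)
    return hits


if __name__ == "__main__":
    seqs = ["TTACGTAA", "GGACCTCC", "CAACGAGT"]
    print(ld_motifs(seqs, l=4, d=1))
```

Enumerating all 4^l candidates is exponential in l, which is one reason this kind of exhaustive search is only practical for short motifs; it is meant to make the problem statement concrete, not to suggest a usable method.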