Identifying short DNA sequence motifs, which serve as binding targets for transcription factors, is a fundamental problem in both computer science and molecular biology. Finding subtle motifs with variable gaps is especially challenging. This paper presents a new algorithm that explores two strategies. First, a new probability matrix is defined on the basis of a neighbourhood set concept; this matrix can capture the target motifs effectively. Second, an iterative restart strategy is used, which combines information from several similar motifs to detect the real motif. To demonstrate the effectiveness of the algorithm, we test it on several kinds of data and compare it with other current representative algorithms. Simulations show that the algorithm can effectively detect subtle motifs.
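The abstract does not define the neighbourhood set or the probability matrix, so the following is only an illustrative sketch under stated assumptions, not the paper's method. It assumes the neighbourhood of a seed word is every same-length substring of the input sequences within Hamming distance d of the seed, and that the probability matrix holds per-column base frequencies of those substrings with a pseudocount. The names `neighbourhood`, `probability_matrix`, `d`, and `pseudocount` are hypothetical, and variable gaps and the iterative restart strategy are not modelled.

```python
from collections import Counter

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def neighbourhood(seed, sequences, d):
    """Assumed definition: all substrings of length len(seed) that lie
    within Hamming distance d of the seed."""
    length = len(seed)
    hits = []
    for seq in sequences:
        for i in range(len(seq) - length + 1):
            word = seq[i:i + length]
            if hamming(seed, word) <= d:
                hits.append(word)
    return hits


def probability_matrix(words, pseudocount=1.0):
    """Assumed definition: per-column base frequencies of the given words,
    smoothed with a pseudocount so that no probability is zero."""
    total = len(words) + pseudocount * len(ALPHABET)
    matrix = []
    for j in range(len(words[0])):
        counts = Counter(w[j] for w in words)
        matrix.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return matrix


# Toy input: each sequence hides a copy of the seed with at most one mutation.
sequences = ["TTACGTACGA", "GGACGTTCGA", "CCACCTACGT"]
hits = neighbourhood("ACGTACG", sequences, d=1)
pm = probability_matrix(hits)
```

On this toy input, `hits` holds one near-copy of the seed per sequence, and each column of `pm` is a probability distribution over A, C, G, and T. A real motif finder would still need a way to choose seeds and to score candidate matrices, which the abstract does not specify.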