An improved voting algorithm for planted (l,d) motif search

Authors:
Yun Xu;Jiaoyun Yang;Yuzhong Zhao;Yi Shang
Venue:
Information Sciences: an International Journal
Year:
2013

Citing 13
Cited 0

Unsupervised Learning of Multiple Motifs in Biopolymers Using Expectation Maximization. Machine Learning - Special issue on applications in molecular biology.
Effective hidden Markov models for detecting splicing junction sites in DNA sequences. Information Sciences: an International Journal.
Combinatorial Approaches to Finding Subtle Signals in DNA Sequences. Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology.
Spelling Approximate Repeated or Common Motifs Using a Suffix Tree. LATIN '98 Proceedings of the Third Latin American Symposium on Theoretical Informatics.
On the complexity of finding common approximate substrings. Theoretical Computer Science.
Fast and Practical Algorithms for Planted (l, d) Motif Search. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB).
Mining gene expression data with pattern structures in formal concept analysis. Information Sciences: an International Journal.
An Improved Heuristic Algorithm for Finding Motif Signals in DNA Sequences. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB).
Fast Exact Algorithms for the Closest String and Substring Problems with Application to the Planted (L,d)-Motif Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB).
Tree-structured algorithm for long weak motif discovery. Bioinformatics.
RISOTTO: fast extraction of motifs with mismatches. LATIN '06 Proceedings of the 7th Latin American conference on Theoretical Informatics.
A compact hybrid feature vector for an accurate secondary structure prediction. Information Sciences: an International Journal.
PMS6: A fast algorithm for motif discovery. ICCABS '12 Proceedings of the 2012 IEEE 2nd International Conference on Computational Advances in Bio and medical Sciences.

Abstract

The planted motif search problem is a classical problem in bioinformatics that seeks to identify meaningful patterns in biological sequences. Because the problem is NP-complete, current algorithms focus on improving average-case time complexity and on solving challenging instances within an acceptable time. In this paper, we propose a new exact algorithm, CVoting, that improves on the state-of-the-art Voting algorithm. CVoting uses a new hash technique to reduce the space complexity to O(mn + N(l,d)) and a new pruning technique to reduce the average time complexity to O(m^2 n N(l,d) (1/4 + 3/l)^l). Experimental results show that CVoting outperforms competing algorithms, including PMS1, RISOTTO, Voting and PMSprune, in both space and time: it is up to an order of magnitude faster and uses less memory when solving challenging instances. The software for the proposed algorithm is publicly available at http://staff.ustc.edu.cn/xuyun/motif.
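
For readers unfamiliar with the voting idea that CVoting builds on, the following is a minimal Python sketch of a plain voting search. It is not the authors' CVoting implementation and includes neither the hash technique nor the pruning technique described in the abstract. It assumes a DNA alphabet, Hamming distance, and the usual reading of N(l,d) as the number of strings within distance d of an l-mer; the function names (`neighbors`, `n_neighbors`, `voting_motif_search`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from math import comb

ALPHABET = "ACGT"  # assumption: DNA sequences


def n_neighbors(l, d):
    """Assumed meaning of N(l,d): number of strings within Hamming distance d of an l-mer."""
    return sum(comb(l, i) * (len(ALPHABET) - 1) ** i for i in range(d + 1))


def neighbors(lmer, d):
    """Yield every string within Hamming distance d of lmer, each exactly once."""
    for k in range(d + 1):
        for positions in combinations(range(len(lmer)), k):
            # At each chosen position, substitute one of the three other letters.
            choices = [[c for c in ALPHABET if c != lmer[p]] for p in positions]
            for subs in product(*choices):
                chars = list(lmer)
                for p, c in zip(positions, subs):
                    chars[p] = c
                yield "".join(chars)


def voting_motif_search(seqs, l, d):
    """Return all l-mers that lie within distance d of some l-mer in every sequence."""
    votes = {}       # candidate -> number of distinct sequences that voted for it
    last_voter = {}  # candidate -> index of the last sequence that voted for it
    for i, seq in enumerate(seqs):
        for j in range(len(seq) - l + 1):
            for cand in neighbors(seq[j:j + l], d):
                # Each sequence may vote for a candidate at most once.
                if last_voter.get(cand) != i:
                    last_voter[cand] = i
                    votes[cand] = votes.get(cand, 0) + 1
    return sorted(c for c, v in votes.items() if v == len(seqs))


if __name__ == "__main__":
    # Toy instance: "ACGT" is planted with at most one mismatch in each sequence.
    seqs = ["TTACGTTT", "GGAGGTGG", "CCACGACC"]
    print(voting_motif_search(seqs, l=4, d=1))
```

The two dictionaries in this sketch grow with the number of distinct candidates that receive votes, which illustrates why memory use is a concern for voting-style methods on larger l and d.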