Finding exact optimal motifs in matrix representation by partitioning

Authors:
Henry C. M. Leung; Francis Y. L. Chin
Affiliations:
Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong (both authors)
Venue:
Bioinformatics
Year:
2005

Abstract

Motivation: Finding common patterns, or motifs, in the promoter regions of co-expressed genes is an important problem in bioinformatics. A motif is commonly represented by a probability matrix or PSSM (position-specific scoring matrix). However, even for a motif of length six or seven, no existing algorithm can guarantee finding the exact optimal matrix among the infinitely many possible matrices.

Results: This paper introduces the first algorithm, called EOMM, for finding the exact optimal matrix-represented motif, or simply the optimal motif. Using a branch-and-bound search that recursively partitions the solution space, EOMM can find the optimal motif of size up to eight or nine, and can find a motif of larger size to any desired accuracy, with the trade-off that a smaller error bound requires a longer running time. Experiments show that for some real and simulated data sets, EOMM finds the motif despite very weak signals, where existing software such as MEME and MITRA-PSSM fails to do so.

Availability:

Contact: cmleung2@cs.hku.hk
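Illustrative sketch: the abstract does not state the objective function that EOMM optimizes, so the following is not the EOMM algorithm. It only shows, under assumptions, what a matrix-represented motif is and how a fixed probability matrix can be used to score candidate sites in a sequence. The function names (`log_odds_score`, `best_site`), the uniform background of 0.25, and the length-3 example matrix are all hypothetical choices made for illustration.

```python
import math


def log_odds_score(window, matrix, background=0.25):
    """Sum of per-position log-odds for `window` under `matrix`.

    `matrix[i][b]` is the probability of base `b` at motif position i.
    Assumes every entry is nonzero and a uniform background (an assumption).
    """
    return sum(
        math.log(matrix[i][base] / background)
        for i, base in enumerate(window)
    )


def best_site(sequence, matrix):
    """Return (score, start) of the highest-scoring window in `sequence`."""
    width = len(matrix)
    return max(
        (log_odds_score(sequence[s:s + width], matrix), s)
        for s in range(len(sequence) - width + 1)
    )


# Hypothetical length-3 probability matrix that strongly prefers "TAG".
matrix = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]

score, start = best_site("CCTAGCC", matrix)
```

Because each matrix entry is a real-valued probability, the set of candidate matrices is continuous, which is why the abstract describes the search space as infinite and why an exact search has to partition that space rather than enumerate it.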