BioOptimizer: a Bayesian scoring function approach to motif discovery

Authors:
Shane T. Jensen; Jun S. Liu
Affiliations:
Department of Statistics, Harvard University, Cambridge, MA 02138-2901, USA (both authors)
Venue:
Bioinformatics
Year:
2004

Abstract

Motivation: Transcription factors (TFs) bind directly to short segments on the genome, often within hundreds to thousands of base pairs upstream of gene transcription start sites, to regulate gene expression. Experimental determination of TF binding sites is expensive and time-consuming. Many motif-finding programs have been developed, but none is clearly superior in all situations, and practitioners often find it difficult to judge which of the predicted motifs are more likely to be biologically relevant.

Results: We derive a comprehensive scoring function based on a full Bayesian model that can handle unknown site abundance, unknown motif width and two-block motifs with variable-length gaps. We propose an algorithm, BioOptimizer, that optimizes this scoring function to reduce noise in the motif signal found by any motif-finding program. BioOptimizer can be used in conjunction with several existing programs, and in both simulation studies and applications to sets of co-regulated genes in bacteria it is more accurate than any of these programs used alone. In addition, the scoring function lets us objectively compare different predicted motifs and select the optimal ones, effectively combining the strengths of existing programs.

Availability: BioOptimizer is available for download at www.fas.harvard.edu/~junliu/BioOptimizer/
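The general idea described under Results, scoring a candidate set of sites and then adjusting that set to raise the score, can be illustrated with a minimal sketch. This is not the BioOptimizer scoring function, which the abstract does not spell out. The sketch assumes a fixed motif width, a single-block motif, a uniform background, sequences containing only A, C, G and T, and overlapping sites left unchecked. The function names `score` and `optimize` and the parameters `alpha` (Dirichlet pseudocount) and `site_prob` (prior probability that a position starts a site) are hypothetical choices made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def score(seqs, sites, width, alpha=0.5, site_prob=0.05):
    """Log marginal-likelihood ratio of a site set against a uniform background.

    sites is a set of (sequence index, start) pairs. Each motif column gets a
    symmetric Dirichlet(alpha) prior (an assumption of this sketch), and every
    site pays the log prior odds log(site_prob / (1 - site_prob)).
    """
    n = len(sites)
    if n == 0:
        return 0.0
    total = n * math.log(site_prob / (1.0 - site_prob))
    for j in range(width):
        # Base counts in motif column j across all current sites.
        counts = Counter(seqs[s][p + j] for s, p in sites)
        # Dirichlet-multinomial marginal likelihood of the column.
        total += math.lgamma(4 * alpha) - math.lgamma(n + 4 * alpha)
        for base in ALPHABET:
            total += math.lgamma(counts[base] + alpha) - math.lgamma(alpha)
    # Subtract the log-likelihood of the same bases under a uniform background.
    return total - n * width * math.log(0.25)


def optimize(seqs, initial_sites, width, **kw):
    """Greedy refinement: add or remove one site at a time while the score rises."""
    sites = set(initial_sites)
    best = score(seqs, sites, width, **kw)
    candidates = [(s, p) for s, seq in enumerate(seqs)
                  for p in range(len(seq) - width + 1)]
    improved = True
    while improved:
        improved = False
        for cand in candidates:
            trial = sites ^ {cand}  # toggle membership of this candidate
            value = score(seqs, trial, width, **kw)
            if value > best + 1e-12:
                sites, best = trial, value
                improved = True
    return sites, best


# Toy input: a starting prediction, as might come from any motif finder.
seqs = ["ACGTTGACAGGCTA", "TTGACATTCGGATC", "GGCATCTTGACAAC", "CATGCCGATTGACA"]
start = {(0, 3), (1, 5)}
sites, value = optimize(seqs, start, width=6)
```

Because every accepted move strictly increases the score, the refined set never scores below the starting prediction. Under the same assumptions, the same `score` function can also rank predictions from different programs against each other.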