Deterministic local alignment methods improved by a simple genetic algorithm

Authors:
Chengpeng Bi
Affiliations:
Bioinformatics and Intelligent Computing Lab, Division of Clinical Pharmacology, Children's Mercy Hospitals, Schools of Medicine, Computing and Engineering, University of Missouri, Kansas City, MO ...
Venue:
Neurocomputing
Year:
2010

Abstract

Multiple sequence local alignment, often used for de novo discovery of biological motifs hidden in a set of DNA or protein sequences, remains a challenge in bioinformatics and computational biology. Many algorithms and software packages have been developed to address the problem. Expectation maximization (EM), one of the most popular local alignment methods, is often used to solve the motif-finding problem. However, EM depends heavily on its initialization and is easily trapped in local optima. This paper presents the Genetic-enabled EM Motif-Finding Algorithm (GEMFA) in an effort to mitigate the difficulties confronting EM-based motif discovery algorithms. The new algorithm integrates a simple genetic algorithm (GA) with a local searcher to explore the local alignment space; that is, it combines deterministic local alignment methods with a simple GA to perform de novo motif discovery effectively. It first initializes a population of multiple local alignments, each of which is encoded on a chromosome that represents a potential solution. GEMFA then performs a heuristic search over the whole alignment space using minimum description length (MDL) as the fitness function, which is generalized from the maximum log-likelihood. The genetic algorithm gradually moves this population towards the best alignment, from which the motif model is derived. Analysis of simulated and real biological sequences showed that GEMFA significantly improved on deterministic local alignment methods, especially for subtle motif sequence alignment, and it also outperformed the other algorithms tested.
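
The general idea in the abstract, a genetic algorithm whose individuals are refined by a deterministic local searcher, can be illustrated with a minimal Python sketch. This is not the GEMFA implementation. It rests on several assumptions not taken from the paper: each chromosome holds one motif start position per sequence, the motif width `w` is fixed, the fitness is a log-likelihood ratio against a uniform background rather than the MDL score, and the local searcher is a hard-assignment EM-like sweep. All function names and parameter values are hypothetical.

```python
import math
import random

ALPHABET = "ACGT"

def profile(seqs, starts, w, pseudo=0.5):
    """Position-specific frequency matrix (with pseudocounts) of an alignment."""
    counts = [{a: pseudo for a in ALPHABET} for _ in range(w)]
    for s, p in zip(seqs, starts):
        for j in range(w):
            counts[j][s[p + j]] += 1
    return [{a: c[a] / sum(c.values()) for a in ALPHABET} for c in counts]

def fitness(seqs, starts, w):
    """Assumed fitness: log-likelihood ratio of aligned sites vs. uniform background."""
    pwm = profile(seqs, starts, w)
    return sum(math.log(pwm[j][s[p + j]] / 0.25)
               for s, p in zip(seqs, starts) for j in range(w))

def local_search(seqs, starts, w, sweeps=1):
    """Deterministic refinement: per sweep, build the profile once, then move
    each site to the offset that scores best under that profile."""
    starts = list(starts)
    for _ in range(sweeps):
        pwm = profile(seqs, starts, w)
        for i, s in enumerate(seqs):
            starts[i] = max(
                range(len(s) - w + 1),
                key=lambda p: sum(math.log(pwm[j][s[p + j]]) for j in range(w)),
            )
    return starts

def evolve(seqs, w, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Simple GA over alignments: local search, elitism, tournament selection,
    uniform crossover, and per-gene random mutation."""
    rng = random.Random(seed)

    def rand_start(s):
        return rng.randrange(len(s) - w + 1)

    def score(ch):
        return fitness(seqs, ch, w)

    # Each chromosome encodes one candidate local alignment.
    pop = [[rand_start(s) for s in seqs] for _ in range(pop_size)]
    for _ in range(generations):
        pop = [local_search(seqs, ch, w) for ch in pop]
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # keep the two fittest alignments unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament of size 3
            b = max(rng.sample(pop, 3), key=score)
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [rand_start(s) if rng.random() < p_mut else c
                     for s, c in zip(seqs, child)]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)
```

Calling `evolve(seqs, w)` on a list of DNA strings returns the best start positions found and their score. Running the local search inside the GA loop is one way to pair the two; the crossover and mutation steps supply the varied starting points that a single EM-style run lacks.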