Modeling evolutionary fitness for DNA motif discovery
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
GAPK: genetic algorithms with prior knowledge for motif discovery in DNA sequences
CEC'09 Proceedings of the Eleventh conference on Congress on Evolutionary Computation
Motif GibbsGA: Sampling Transcription Factor Binding Sites Coupled with PSFM Optimization by GA
ISICA '09 Proceedings of the 4th International Symposium on Advances in Computation and Intelligence
An improved genetic algorithm for DNA motif discovery with public domain information
ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I
Challenges rising from learning motif evaluation functions using genetic programming
Proceedings of the 12th annual conference on Genetic and evolutionary computation
A Cluster Refinement Algorithm for Motif Discovery
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
iGAPK: improved GAPK algorithm for regulatory DNA motif discovery
ICONIP'10 Proceedings of the 17th international conference on Neural information processing: models and applications - Volume Part II
Finding gapped motifs by a novel evolutionary algorithm
EvoBIO'10 Proceedings of the 8th European conference on Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics
Proceedings of the 14th annual conference companion on Genetic and evolutionary computation
Motivation: Identification of transcription factor binding sites (TFBSs) plays an important role in deciphering the mechanisms of gene regulation. Recently, GAME, a Genetic Algorithm (GA)-based approach with iterative post-processing, has shown superior performance in TFBS identification. However, the basic GA in GAME is not elaborately designed, and may be trapped in local optima in real problems. The feature operators are only applied in the post-processing, but the final performance heavily depends on the GA output. Hence, both effectiveness and efficiency of the overall algorithm can be improved by introducing more advanced representations and novel operators in the GA, as well as designing the post-processing in an adaptive way.

Results: We propose a novel framework GALF-P, consisting of Genetic Algorithm with Local Filtering (GALF) and adaptive post-processing techniques (-P), to achieve both effectiveness and efficiency for TFBS identification. GALF combines the position-led and consensus-led representations used separately in current GAs and employs a novel local filtering operator to get rid of false positives within an individual efficiently during the evolutionary process in the GA. Pre-selection is used to maintain diversity and avoid local optima. Post-processing with adaptive adding and removing is developed to handle general cases with arbitrary numbers of instances per sequence. GALF-P shows superior performance to GAME, MEME, BioProspector and BioOptimizer on synthetic datasets with difficult scenarios and real test datasets. GALF-P is also more robust and reliable when further compared with GAME, the current state-of-the-art approach.

Availability: http://www.cse.cuhk.edu.hk/~tmchan/GALFP/

Contact: tmchan@cse.cuhk.edu.hk

Supplementary information: Supplementary data are available at Bioinformatics online.
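
To make the two representations named in the Results more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions and not the GALF-P implementation. A position-led individual is taken to be one candidate start position per sequence, and a consensus-led view is derived from it as the column-wise majority string. The function names (`extract_sites`, `consensus`, `local_filter`) and the `max_mismatch` parameter are hypothetical. The filter shown is a plain Hamming-distance threshold used as a stand-in; the abstract does not specify how the actual local filtering operator works.

```python
from collections import Counter


def extract_sites(sequences, positions, width):
    """Position-led view: cut one candidate site of length `width` per sequence."""
    return [seq[pos:pos + width] for seq, pos in zip(sequences, positions)]


def consensus(sites):
    """Consensus-led view: the most frequent base in each aligned column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))


def mismatches(site, motif):
    """Hamming distance between a candidate site and the consensus."""
    return sum(a != b for a, b in zip(site, motif))


def local_filter(sequences, positions, width, max_mismatch):
    """Stand-in filter (an assumption, not the paper's operator):
    keep only the sites within `max_mismatch` of the consensus."""
    sites = extract_sites(sequences, positions, width)
    motif = consensus(sites)
    kept = [
        (index, pos)
        for index, (pos, site) in enumerate(zip(positions, sites))
        if mismatches(site, motif) <= max_mismatch
    ]
    return motif, kept


# Toy data: three sequences contain ACGT, the fourth candidate does not.
sequences = ["TTACGTAA", "ACGTTTTT", "GGGACGTC", "CCCCTTTT"]
positions = [2, 0, 3, 0]
motif, kept = local_filter(sequences, positions, width=4, max_mismatch=1)
```

In this toy run the fourth candidate site differs from the derived consensus in three positions and is dropped, which is the kind of within-individual false positive the abstract refers to. A real GA would wrap this in a fitness function, selection and variation, none of which are specified here.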