Information Sciences: an International Journal
Discovery of transcription factor binding sites (TFBSs), or DNA motifs, in the promoter regions of genes plays a key role in understanding the regulation of gene expression. Over the past decade, computational approaches to motif search, including evolutionary computation techniques, have demonstrated good potential, and some results reported in the literature are quite promising. Recent progress on evolutionary mining of motifs is documented in GAME and GALF-P: GAME employs a Bayesian-based scoring function, and GALF-P aims to improve algorithm performance with local filtering and adaptive post-processing. To improve discovery performance in terms of recall, precision and algorithm reliability, this paper presents an alternative genetic algorithm, termed GAPK, for the motif discovery problem. In the proposed GAPK framework, prior knowledge of motifs in a given dataset is used to initialize the population. Our technical contributions include a matrix representation for k-mers, a mismatch-based filtering method for search space reduction, a model mismatch score (MMS) as the fitness function, new genetic operations, and a model refinement process. Benchmark datasets associated with eight transcription factors are used in our experiments. Comparative studies were carried out against well-known tools including GAME, GALF-P, MEME, MDScan and AlignACE. Results show that our method outperforms the other techniques in terms of F-measure.
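The abstract names a matrix representation for k-mers, mismatch-based filtering and an F-measure comparison without giving their definitions. The following Python sketch illustrates one plausible reading of each idea and is not the GAPK implementation: the 4 x k binary encoding, the seed-based filter, and the function names `kmer_to_matrix`, `mismatches`, `filter_kmers` and `f_measure` are all assumptions made for illustration. Only the F-measure formula (the harmonic mean of precision and recall) is standard.

```python
BASES = "ACGT"  # assumed row order for the matrix encoding


def kmer_to_matrix(kmer):
    """Encode a k-mer as a 4 x k binary matrix (rows A, C, G, T).

    Assumption: a one-hot encoding; the paper's own representation
    may differ in detail.
    """
    kmer = kmer.upper()
    if any(ch not in BASES for ch in kmer):
        raise ValueError("k-mer must contain only A, C, G, T")
    return [[1 if ch == base else 0 for ch in kmer] for base in BASES]


def mismatches(a, b):
    """Count positions at which two equal-length k-mers differ."""
    if len(a) != len(b):
        raise ValueError("k-mers must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def filter_kmers(sequence, seed, max_mismatches):
    """Keep the k-mers of `sequence` within `max_mismatches` of `seed`.

    Hypothetical stand-in for mismatch-based search space reduction.
    """
    k = len(seed)
    windows = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [w for w in windows if mismatches(w, seed) <= max_mismatches]


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For instance, `filter_kmers("ACGTACGA", "ACGT", 1)` keeps only the two windows that differ from the seed in at most one position, which shows how a mismatch threshold can shrink the set of candidate binding sites before any evolutionary search runs.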