Objective: The gene promoter region controls transcriptional initiation of a gene, which is the most important step in gene regulation. In silico detection of promoter regions in genomic sequences has a number of applications in gene discovery and in understanding the regulation of gene expression. However, computational prediction of eukaryotic poly-II promoters has remained a difficult task. This paper introduces a novel statistical technique for detecting promoter regions in long genomic sequences.

Method: A number of existing techniques analyze the occurrence frequencies of oligonucleotides in promoter sequences as compared to other genomic regions. In contrast, the present work studies the positional densities of oligonucleotides in promoter sequences. The analysis does not require any non-promoter sequence dataset or any model of the background oligonucleotide content of the genome. The statistical model learnt from a dataset of promoter sequences automatically recognizes a number of transcription factor binding sites, together with their occurrence positions relative to the transcription start site. Based on this model, a continuous naive Bayes classifier is developed for the detection of human promoters and transcription start sites in genomic sequences.

Results: The present study extends the scope of statistical models in general promoter modeling and prediction. Promoter sequence features learnt by the model correlate well with known biological facts. Results of human transcription start site prediction compare favorably with existing second-generation promoter prediction tools.
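The idea of positional densities described under Method can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single Gaussian per oligonucleotide over positions relative to the transcription start site (the actual model may be richer), it uses 4-mers, and it scores a candidate start site against a uniform positional baseline, which is an assumption introduced here only to make the score comparable across candidates. All function names, parameters, and the toy sequences are hypothetical.

```python
import math
from collections import defaultdict


def fit_positional_densities(promoters, tss_index, k=4, min_count=2, min_std=1.0):
    """Fit one Gaussian over TSS-relative positions for each k-mer.

    Assumption: every training promoter is aligned so that its
    transcription start site sits at index `tss_index`.
    """
    positions = defaultdict(list)
    for seq in promoters:
        for i in range(len(seq) - k + 1):
            positions[seq[i:i + k]].append(i - tss_index)

    model = {}
    for kmer, pos in positions.items():
        if len(pos) < min_count:
            continue  # too few observations to estimate a density
        mean = sum(pos) / len(pos)
        var = sum((p - mean) ** 2 for p in pos) / len(pos)
        # Floor the standard deviation so sharply localized k-mers stay finite.
        model[kmer] = (mean, max(math.sqrt(var), min_std))
    return model


def log_gaussian(x, mean, std):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def score_tss(sequence, candidate, model, k=4, upstream=40, downstream=10):
    """Naive Bayes style score for `candidate` being a start site.

    Each k-mer in the window contributes independently: the log of its
    positional density minus the log of a uniform density over the window
    (the uniform baseline is an assumption of this sketch).
    """
    log_uniform = -math.log(upstream + downstream)
    start = max(0, candidate - upstream)
    end = min(len(sequence) - k + 1, candidate + downstream)
    total = 0.0
    for i in range(start, end):
        params = model.get(sequence[i:i + k])
        if params is not None:
            total += log_gaussian(i - candidate, *params) - log_uniform
    return total


# Toy data (hypothetical): a TATA-like motif 30 bases upstream of the start site.
toy_promoter = "G" * 10 + "TATA" + "G" * 36
model = fit_positional_densities([toy_promoter] * 3, tss_index=40)
genome = "C" * 20 + toy_promoter + "C" * 20
best = max(range(40, 80), key=lambda c: score_tss(genome, c, model))
print("highest-scoring candidate index:", best)
```

Because only promoter sequences are used to fit the densities, the sketch mirrors the point made above that no non-promoter dataset is needed; sharply localized oligonucleotides receive narrow densities and therefore dominate the score when a candidate start site is correctly aligned.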