Advancing the state of the art in computational gene prediction

  • Authors:
  • William H. Majoros; Uwe Ohler

  • Affiliations:
  • Center for Bioinformatics and Computational Biology, Institute for Genome Sciences and Policy, Duke University, Durham, NC; Center for Bioinformatics and Computational Biology, Institute for Genome Sciences and Policy, Duke University, Durham, NC

  • Venue:
  • KDECB'06 Proceedings of the 1st international conference on Knowledge discovery and emergent complexity in bioinformatics
  • Year:
  • 2006

Abstract

Current methods for computationally predicting the locations and intron-exon structures of protein-coding genes in eukaryotic DNA are largely based on probabilistic, state-based generative models such as hidden Markov models and their various extensions. Unfortunately, little attention has been paid to the optimality of these models for the gene-parsing problem. Furthermore, as the prevalence of alternative splicing in human genes becomes more apparent, the "one gene, one parse" discipline endorsed by virtually all current gene-finding systems becomes less attractive from a biomedical perspective. Because our ability to accurately identify all the isoforms of each gene in the genome is of direct importance to biomedicine, our ability to improve gene-finding accuracy both for human and non-human DNA clearly has a potential to significantly impact human health. In this paper we review current methods and suggest a number of possible directions for further research that may alleviate some of these problems and ultimately lead to better and more useful gene predictions.