Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Automated detection or prediction of coding sequences within genomic DNA has been a major rate-limiting step in the pursuit of vertebrate genes. Currently available programs are far from powerful enough to elucidate a gene structure completely. In this paper, we present a new system, called GeneScout, for predicting gene structures in vertebrate genomic DNA. The system contains specially designed hidden Markov models (HMMs) for detecting functional sites, including protein translation start sites and mRNA splicing junction donor and acceptor sites. An HMM is also proposed for computing exon coding potential. Our main hypothesis is that, given a vertebrate genomic DNA sequence S, it is always possible to construct a directed acyclic graph G such that the path for the actual coding region of S is in the set of all paths on G. Thus, the gene detection problem is reduced to that of analyzing the paths in the graph G. A dynamic programming algorithm is used to find the optimal path in G. The proposed system is trained using an expectation-maximization algorithm, and its performance on vertebrate gene prediction is evaluated using the 10-way cross-validation method. Experimental results show that the proposed system performs well and is comparable to existing gene discovery tools.
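As a rough illustration of the path-analysis step, the sketch below finds the highest-scoring path through a small directed acyclic graph by dynamic programming. It is not GeneScout's implementation. The function name `best_path`, the node numbering, and the edge scores are assumptions made for this example, and the abstract does not say how nodes and edges are actually scored. Here nodes stand for hypothetical candidate sites ordered by sequence position, and each edge carries a hypothetical log-score for the segment between two sites.

```python
from collections import defaultdict


def best_path(num_nodes, edges, source, sink):
    """Return (score, path) for the highest-scoring source-to-sink path.

    Assumption: nodes are numbered in topological order, so every edge
    (u, v, w) satisfies u < v. This holds naturally when nodes are
    candidate sites ordered by their position in the sequence.
    """
    incoming = defaultdict(list)
    for u, v, w in edges:
        if u >= v:
            raise ValueError("edges must go from a lower to a higher index")
        incoming[v].append((u, w))

    neg_inf = float("-inf")
    best = [neg_inf] * num_nodes   # best[v]: best score of any path source -> v
    back = [None] * num_nodes      # back[v]: predecessor of v on that path
    best[source] = 0.0

    # Visit nodes in topological order and relax every incoming edge.
    for v in range(source + 1, num_nodes):
        for u, w in incoming[v]:
            if best[u] != neg_inf and best[u] + w > best[v]:
                best[v] = best[u] + w
                back[v] = u

    if best[sink] == neg_inf:
        return neg_inf, []         # sink is unreachable from source

    # Follow back-pointers from the sink to recover the path.
    path = [sink]
    while path[-1] != source:
        path.append(back[path[-1]])
    path.reverse()
    return best[sink], path


# Toy graph with made-up log-scores (hypothetical values).
toy_edges = [
    (0, 1, -0.5),
    (0, 2, -2.0),
    (1, 2, -0.25),
    (1, 3, -3.0),
    (2, 3, -0.75),
]
score, path = best_path(4, toy_edges, source=0, sink=3)
print(score, path)  # -1.5 [0, 1, 2, 3]
```

Because each node is processed once and each edge is relaxed once, the running time is linear in the size of the graph. A real gene-structure predictor would additionally need constraints that this toy example omits, such as reading-frame consistency between consecutive exons.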