Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology
ExonHunter: a comprehensive approach to gene finding
Bioinformatics
Comparative Exon Prediction based on Heuristic Coding Region Alignment
ISPAN '05 Proceedings of the 8th International Symposium on Parallel Architectures,Algorithms and Networks
Hi-index | 0.00 |
Identifying protein coding genes is one of most important task in newly sequenced genomes. With increasing numbers of gene annotations verified by experiments, it is feasible to identify genes in newly sequenced genomes by comparing with genes annotated on phylogenetically close organisms. Here, we propose a program, GeneAlign, which predicts the genes on one sequence by measuring the similarity between the predicted sequence and related genes annotated on another genome. The program applies CORAL, a heuristic linear time alignment tool, to determine whether the regions flanked by candidate signals are similar with the annotated exons or not. The approach, which employs the conservation of gene structures and sequence homologies between protein coding regions, increases the prediction accuracy. GeneAlign was tested on Projector data set of 449 human-mouse homologous sequence pairs. At the gene level, the sensitivity and specificity of GeneAlign are 80%, and larger than 96% at the exon level.