Prediction of translation initiation sites using classifier selection

Authors:
George Tzanis;Ioannis Vlahavas
Affiliations:
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece;Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Venue:
SETN'06 Proceedings of the 4th Hellenic conference on Advances in Artificial Intelligence
Year:
2006

Abstract

Predicting the translation initiation site (TIS) in a genomic sequence is an important problem in biological research. Several methods have been proposed, but the problem remains open. In this paper we follow a multi-step approach to increase TIS prediction accuracy. First, all sequences are scanned and the candidate TISs are detected. These candidates are grouped according to the length of the sequence upstream and downstream of them, and a number of features are generated for each one. The features are evaluated over the instances of each group, and a number of the top-ranked ones are selected for building a classifier for that group. A new instance is assigned to a group and classified by the corresponding classifier. We experiment with various feature sets and classification algorithms, compare the approach with alternative methods, and draw important conclusions.
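
The first two steps described in the abstract (scanning for candidate TISs and grouping them by upstream and downstream sequence length) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes candidates are ATG codons, and the names `find_candidates`, `group_key`, `group_candidates` and the `bin_size` parameter are hypothetical, since the abstract does not say how the length groups are defined.

```python
from collections import defaultdict

# Assumption: a candidate TIS is any ATG codon in the sequence.
START_CODON = "ATG"


def find_candidates(seq):
    """Return the 0-based start positions of every ATG in seq."""
    seq = seq.upper()
    positions = []
    i = seq.find(START_CODON)
    while i != -1:
        positions.append(i)
        i = seq.find(START_CODON, i + 1)
    return positions


def group_key(seq_len, pos, bin_size=50):
    """Map a candidate to a group from its upstream/downstream lengths.

    Hypothetical grouping rule: both lengths are binned into
    fixed-width intervals of bin_size nucleotides.
    """
    upstream = pos                                   # nucleotides before the ATG
    downstream = seq_len - (pos + len(START_CODON))  # nucleotides after the ATG
    return (upstream // bin_size, downstream // bin_size)


def group_candidates(sequences, bin_size=50):
    """Group all candidates of all sequences by their length-based key."""
    groups = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in find_candidates(seq):
            key = group_key(len(seq), pos, bin_size)
            groups[key].append((seq_id, pos))
    return dict(groups)
```

In a full pipeline, each group would then get its own feature ranking and its own trained classifier, and a new candidate would be routed to the classifier stored under its `group_key`. Those steps are omitted here because the abstract does not specify the features or the algorithms used.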