C4.5: programs for machine learning
Data mining: practical machine learning tools and techniques with Java implementations
Generating Accurate Rule Sets Without Global Optimization
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology
Translation Initiation Sites Prediction with Mixture Gaussian Models in Human cDNA Sequences
IEEE Transactions on Knowledge and Data Engineering
Estimating continuous distributions in Bayesian classifiers
UAI'95 Proceedings of the Eleventh conference on Uncertainty in artificial intelligence
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
High efficiency on prediction of translation initiation site (TIS) of RefSeq sequences
BSB'07 Proceedings of the 2nd Brazilian conference on Advances in bioinformatics and computational biology
A novel data mining approach for the accurate prediction of translation initiation sites
ISBMDA'06 Proceedings of the 7th international conference on Biological and Medical Data Analysis
The prediction of the translation initiation site (TIS) in a genomic sequence is an important issue in biological research. Several methods have been proposed to address it, yet it remains an open problem. In this paper we follow a multi-step approach to increase TIS prediction accuracy. First, all sequences are scanned and the candidate TISs are detected. These candidates are grouped according to the length of the sequence upstream and downstream of them, and a number of features are generated for each one. The features are evaluated over the instances of each group, and the top-ranked ones are selected for building a classifier for that group. A new instance is assigned to a group and classified by the corresponding classifier. We experiment with various feature sets and classification algorithms, compare the approach with alternative methods, and draw conclusions from the results.
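The first two steps described above (scanning for candidate TISs and grouping them by the amount of flanking sequence) can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: candidates are taken to be every ATG triplet, and groups are formed by checking whether the upstream and downstream lengths reach a hypothetical `threshold`. The function names are illustrative only, and feature generation, feature ranking, and classifier training are omitted.

```python
from collections import defaultdict


def find_candidate_tis(seq):
    """Return the 0-based positions of every ATG triplet in seq.

    Assumption: a candidate TIS is any ATG (AUG in RNA notation).
    """
    seq = seq.upper().replace("U", "T")
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]


def group_key(seq_len, pos, threshold=99):
    """Hypothetical grouping rule based on flanking sequence length.

    Returns a pair of booleans: whether at least `threshold` nucleotides
    are available upstream and downstream of the candidate codon.
    """
    upstream = pos                   # nucleotides before the ATG
    downstream = seq_len - (pos + 3)  # nucleotides after the ATG
    return (upstream >= threshold, downstream >= threshold)


def group_candidates(sequences, threshold=99):
    """Map each group key to a list of (sequence id, position) candidates."""
    groups = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in find_candidate_tis(seq):
            key = group_key(len(seq), pos, threshold)
            groups[key].append((seq_id, pos))
    return dict(groups)
```

Under these assumptions, each group would then receive its own feature ranking and its own classifier, and a new candidate would be routed to a classifier by computing the same `group_key`.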