Predicting the Translation Initiation Site (TIS) in a genomic sequence is an important problem in biological research. Although several methods have been proposed, there is considerable room to improve their accuracy. Because of noise in the data as well as biological factors, TIS prediction remains an open and non-trivial problem. In this paper we follow a three-step approach to increase TIS prediction accuracy. In the first step, we apply a feature generation algorithm that we developed. In the second step, all candidate features, including some new ones generated by our algorithm, are ranked according to their impact on prediction accuracy. In the third step, a classification model is built from a number of the top-ranked features. We experiment with various feature sets, feature selection methods and classification algorithms, compare the results with alternative methods, draw conclusions from the comparison, and propose models with improved prediction accuracy.
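The three steps described above (generate features, rank them, classify on the top-ranked ones) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the positional nucleotide features, information gain as the ranking criterion, the naive Bayes classifier, the toy sequences, and all function names are hypothetical stand-ins for whatever the authors actually used.

```python
import math
from collections import Counter

def candidate_features(seq, atg_pos, window=3):
    """Step 1 (hypothetical features): nucleotide at each offset around a candidate ATG."""
    feats = {}
    for off in range(-window, 3 + window):
        if 0 <= off < 3:
            continue  # skip the ATG codon itself
        i = atg_pos + off
        feats[f"pos{off:+d}"] = seq[i] if 0 <= i < len(seq) else "N"
    return feats

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Step 2 (assumed criterion): information gain of one categorical feature."""
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def rank_features(rows, labels):
    """Return feature names ordered from highest to lowest information gain."""
    names = sorted(rows[0])
    return sorted(names, key=lambda f: information_gain(rows, labels, f), reverse=True)

def train_naive_bayes(rows, labels, features):
    """Step 3 (assumed classifier): categorical naive Bayes over the selected features."""
    class_counts = Counter(labels)
    value_counts = Counter()
    for row, y in zip(rows, labels):
        for f in features:
            value_counts[(y, f, row[f])] += 1
    return class_counts, value_counts, list(features)

def predict(model, row, alphabet_size=5):
    """Pick the class with the highest log-posterior, using add-one smoothing."""
    class_counts, value_counts, features = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for y, n in class_counts.items():
        score = math.log(n / total)
        for f in features:
            score += math.log((value_counts[(y, f, row[f])] + 1) / (n + alphabet_size))
        if score > best_score:
            best, best_score = y, score
    return best

# Toy, made-up training data: (sequence, index of the candidate ATG, label 1 = true TIS).
examples = [
    ("GCCACCATGGCG", 6, 1),
    ("GCCGCCATGGAT", 6, 1),
    ("TTTTTTATGTTT", 6, 0),
    ("CCTCTTATGCAA", 6, 0),
]
rows = [candidate_features(seq, pos) for seq, pos, _ in examples]
labels = [label for _, _, label in examples]

ranked = rank_features(rows, labels)          # step 2: rank all candidate features
model = train_naive_bayes(rows, labels, ranked[:2])  # step 3: keep the top two

likely_tis = predict(model, candidate_features("AAAACCATGGTT", 6))
likely_not = predict(model, candidate_features("GGGTTTATGTCC", 6))
```

Keeping the ranking step separate from the classifier makes it straightforward to swap in a different selection measure or learning algorithm, which is the kind of comparison the abstract says the authors carried out.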