An important task in gene discovery is the correct prediction of the translation initiation site (TIS). The TIS may coincide with the first AUG codon, but this is not always the case. This task can be modeled as a binary classification problem between positive (TIS) and negative (non-TIS) patterns. Here we use a Support Vector Machine (SVM) applied to data processed by the class-balancing method SMOTE (Synthetic Minority Over-sampling Technique). SMOTE was used because the databases in this work are imbalanced, with an average positive-to-negative pattern ratio of about 1:28. As a result, we attained accuracy, precision, sensitivity, and specificity values of 99% on average.
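To make the class-balancing step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic positive is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote`, the choice `k=5`, the random seeds, and the two-dimensional Gaussian toy data are assumptions made for illustration only; the abstract does not specify the feature encoding or the SMOTE settings used.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority neighbours (the basic SMOTE idea).

    `minority` is a list of equal-length numeric feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        # Squared Euclidean distance; enough for ranking neighbours.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data with roughly the 1:28 ratio quoted in the abstract.
data_rng = random.Random(1)
positives = [[data_rng.gauss(2.0, 0.5), data_rng.gauss(2.0, 0.5)] for _ in range(10)]
negatives = [[data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)] for _ in range(280)]

# Oversample the positive class until both classes have the same size.
new_positives = smote(positives, n_synthetic=len(negatives) - len(positives))
balanced_positives = positives + new_positives
print(len(balanced_positives), len(negatives))
```

In this sketch the balanced set would then be passed to an SVM trainer. Oversampling should be applied to the training portion only, so that evaluation data keeps the original class ratio.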