High efficiency on prediction of translation initiation site (TIS) of RefSeq sequences

Authors:
Cristiane N. Nobre;J. Miguel Ortega;Antônio de Pádua Braga
Affiliations:
Bioinformática, UFMG;Laboratório de Biodados, ICB, UFMG;Engenharia Eletrônica, UFMG
Venue:
BSB'07 Proceedings of the 2nd Brazilian conference on Advances in bioinformatics and computational biology
Year:
2007

Abstract

An important task in gene discovery is the correct prediction of the translation initiation site (TIS). The TIS can correspond to the first AUG, but this is not always the case. The task can be modeled as a classification problem between positive (TIS) and negative patterns. Here we use a Support Vector Machine (SVM) trained on data processed by the class-balancing method SMOTE (Synthetic Minority Over-sampling Technique). SMOTE was used because the databases used in this work are imbalanced, with an average positive-to-negative pattern ratio of around 1:28. With this approach we attained accuracy, precision, sensitivity, and specificity values of 99% on average.
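
To illustrate the class-balancing step, the following is a minimal sketch of the core SMOTE idea: each synthetic positive is placed on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. It is not the authors' implementation. The function name `smote`, its parameters, the random choice of base sample, and the 2-D toy vectors are all assumptions made for illustration; the abstract does not state how sequences were encoded or which SMOTE settings were used. The toy data only mirrors the reported 1:28 ratio.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    # For every minority sample, find the indices of its k nearest
    # minority neighbours (Euclidean distance, excluding the sample itself).
    neighbours = []
    for i, x in enumerate(minority):
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(x, minority[j]))
        neighbours.append(others[:k])

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))   # base sample, chosen at random
        j = rng.choice(neighbours[i])      # one of its nearest neighbours
        gap = rng.random()                 # position along the segment
        x, nb = minority[i], minority[j]
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


# Toy data with the 1:28 positive/negative ratio mentioned in the abstract.
# The 2-D numeric vectors are stand-ins, not real sequence encodings.
data_rng = random.Random(1)
positives = [[data_rng.gauss(1.0, 0.2), data_rng.gauss(1.0, 0.2)]
             for _ in range(10)]
negatives = [[data_rng.gauss(0.0, 0.5), data_rng.gauss(0.0, 0.5)]
             for _ in range(280)]

# Oversample the positives until both classes have the same size.
synthetic = smote(positives, n_synthetic=len(negatives) - len(positives))
balanced_positives = positives + synthetic
print(len(balanced_positives), len(negatives))  # 280 280
```

In a setup like this, oversampling would normally be applied only to the training portion of the data, so that synthetic points do not leak into the examples used to estimate accuracy, precision, sensitivity, and specificity.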