The nature of statistical learning theory
The nature of statistical learning theory
An introduction to support Vector Machines: and other kernel-based learning methods
An introduction to support Vector Machines: and other kernel-based learning methods
Prediction of Protein Secondary Structure with two-stage multi-class SVMs
International Journal of Data Mining and Bioinformatics
Gene Classification Using Codon Usage and Support Vector Machines
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Machine Learning in Bioinformatics
Machine Learning in Bioinformatics
LIBSVM: A library for support vector machines
ACM Transactions on Intelligent Systems and Technology (TIST)
A comparison of methods for multiclass support vector machines
IEEE Transactions on Neural Networks
Hi-index | 0.00 |
Classification of genes into biologically related groups facilitates inference of their functions. Codon usage bias has been described previously as a potential feature for gene classification. In this paper, we demonstrate that di-codon usage can further improve classification of genes. By using both codon and di-codon features, we achieve near perfect accuracies for the classification of HLA molecules into major classes and sub-classes. The method is illustrated on 1,841 HLA sequences which are classified into two major classes, HLA-I and HLA-II. Major classes are further classified into sub-groups. A binary SVM using di-codon usage patterns achieved 99.95% accuracy in the classification of HLA genes into major HLA classes; and multi-class SVM achieved accuracy rates of 99.82% and 99.03% for sub-class classification of HLA-I and HLA-II genes, respectively. Furthermore, by combining codon and di-codon usages, the prediction accuracies reached 100%, 99.82%, and 99.84% for HLA major class classification, and for sub-class classification of HLA-I and HLA-II genes, respectively.