Abstract: We propose a novel approach to gene classification that uses codon usage bias as the input feature vector for support vector machines (SVM). Each DNA sequence is first converted to a 59-dimensional feature vector in which each element is the relative synonymous usage frequency of one codon. Because the input to the classifier is independent of sequence length and variance, the approach remains useful when the sequences to be classified have different lengths, a condition under which homology-based methods tend to fail. The method is demonstrated on 1,841 Human Leukocyte Antigen (HLA) sequences, which fall into two major classes, HLA-I and HLA-II; each major class is further subdivided into sub-groups of HLA-I and HLA-II molecules. Using codon usage frequencies, a binary SVM achieved an accuracy of 99.3% for HLA major-class classification, and a multi-class SVM achieved accuracies of 99.73% and 98.38% for sub-class classification of HLA-I and HLA-II molecules, respectively. The results show that gene classification based on codon usage bias is consistent with the molecular structures and biological functions of HLA molecules.
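The feature extraction step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the common definition of relative synonymous codon usage (a codon's observed count divided by the mean count across its synonymous family) and assumes that the 59 dimensions are the 64 codons minus the three stop codons and the two single-codon amino acids (ATG for Met, TGG for Trp). The names `rscu_vector` and `FEATURE_CODONS`, and the choice to emit 0.0 for amino acids absent from a sequence, are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons into synonymous families, skipping stop codons.
FAMILIES = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES.setdefault(aa, []).append(codon)

# Assumption: single-codon families (Met, Trp) carry no usage bias and are dropped.
FEATURE_CODONS = sorted(c for fam in FAMILIES.values() if len(fam) > 1 for c in fam)


def rscu_vector(seq):
    """Return the 59 RSCU values of an in-frame coding sequence."""
    seq = seq.upper().replace("U", "T")
    # Count non-overlapping codons; a trailing partial codon is ignored.
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    features = {}
    for fam in FAMILIES.values():
        if len(fam) == 1:
            continue
        total = sum(counts[c] for c in fam)
        for c in fam:
            # Observed count over the family mean; 0.0 if the amino acid
            # never occurs (an assumption made for this sketch).
            features[c] = counts[c] * len(fam) / total if total else 0.0
    return [features[c] for c in FEATURE_CODONS]
```

Because each value is a ratio within a synonymous family, repeating a sequence leaves its vector unchanged, which illustrates the length independence the abstract refers to. The resulting fixed-length vectors could then be passed to any SVM implementation.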