This paper presents a novel approach to extracting features from motif content and protein composition for protein sequence classification. First, we represent each protein sequence as a fixed-dimensional vector built from its motif content and protein composition. Then, we project these vectors into a low-dimensional space using Principal Component Analysis (PCA), so that each vector is represented as a combination of the eigenvectors of the covariance matrix of the vectors. Next, a Genetic Algorithm (GA) simultaneously selects a subset of biological and functional sequence features from the eigenspace and optimizes the regularization parameter of the Support Vector Machine (SVM). Finally, SVM classifiers assign protein sequences to their corresponding families based on the selected feature subsets. In experiments, our approach shows promising results in comparison with the existing PSI-BLAST and SVM-pairwise methods.
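The abstract does not specify how sequences or GA individuals are encoded, so the following is only a minimal sketch under stated assumptions. It assumes that composition means the fraction of each of the 20 standard amino acids, that motif content is a presence/absence flag per motif, and that a GA chromosome is a bit list whose first bits mask the PCA components and whose last bits encode the SVM regularization parameter C as a power of two. All function names, the motif strings, and the parameters `c_bits` and `c_min_exp` are hypothetical. The PCA projection and SVM training steps are omitted.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def composition_vector(seq):
    """Fraction of each standard amino acid in seq (assumed composition encoding)."""
    seq = seq.upper()
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def motif_vector(seq, motifs):
    """1.0 if a motif occurs as a plain substring, else 0.0 (assumed motif encoding)."""
    seq = seq.upper()
    return [1.0 if m in seq else 0.0 for m in motifs]


def decode_chromosome(bits, n_features, c_bits=4, c_min_exp=-2):
    """Split a bit list into selected feature indices and an SVM C value.

    The first n_features bits form the feature mask. The next c_bits bits are
    read as an unsigned integer k, giving C = 2 ** (c_min_exp + k).
    """
    mask = bits[:n_features]
    c_part = bits[n_features:n_features + c_bits]
    k = int("".join(str(b) for b in c_part), 2)
    selected = [i for i, b in enumerate(mask) if b == 1]
    return selected, 2.0 ** (c_min_exp + k)


def random_chromosome(n_features, c_bits=4, rng=random):
    """Random individual for an initial GA population."""
    return [rng.randint(0, 1) for _ in range(n_features + c_bits)]


# Example: a toy sequence and two hypothetical motifs.
features = composition_vector("MKVLAA") + motif_vector("MKVLAA", ["KVL", "GGG"])
selected, C = decode_chromosome([1, 0, 1, 1, 0, 0, 1, 1], n_features=4)
```

Encoding the feature mask and C in one chromosome is what lets a single GA run search both at once: the fitness of an individual would be the validation accuracy of an SVM trained with the decoded C on the decoded subset of components.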