Precise information about the location of a protein in a cell facilitates understanding of the protein's function and its interactions in the cellular environment. This information further helps in the study of specific metabolic pathways and other biological processes. We propose an ensemble approach called "CE-PLoc" for predicting subcellular locations based on the fusion of individual classifiers. The proposed approach uses features obtained from both dipeptide composition (DC) and amphiphilic pseudo amino acid composition (PseAAC) based feature extraction strategies. Different feature spaces are obtained by varying the dimensionality of the PseAAC features for a selected base learner. First, the performance of individual learning mechanisms such as support vector machine, nearest neighbor, probabilistic neural network, and covariant discriminant, each trained using PseAAC based features, is analyzed. Classifiers are then developed using the same learning mechanism but trained on PseAAC based feature spaces of varying dimensions. These classifiers are combined through a voting strategy, which improves prediction performance. Prediction performance is further enhanced by developing CE-PLoc through the combination of different learning mechanisms trained on both the DC based feature space and PseAAC based feature spaces of varying dimensions. The predictive performance of the proposed CE-PLoc is evaluated on two benchmark datasets of protein subcellular locations using accuracy, MCC, and Q-statistics. Using the jackknife test, prediction accuracies of 81.47% and 83.99% are obtained for the 12 and 14 subcellular location datasets, respectively. In the independent dataset test, prediction accuracies are 87.04% and 87.33% for the 12 and 14 class datasets, respectively.
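As an illustration only, the following minimal Python sketch shows three building blocks named in the abstract: a dipeptide composition feature vector, fusion of classifier outputs by voting, and the Matthews correlation coefficient (MCC) for one class. It is not the authors' implementation. The function names, the handling of non-standard residues, the tie-breaking rule in the vote, and the one-vs-rest form of MCC are all assumptions made for this sketch; the abstract does not specify them.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order (AA, AC, AD, ...).
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_SET = set(DIPEPTIDES)


def dipeptide_composition(sequence):
    """Return the 400-dimensional frequency vector of adjacent residue pairs."""
    pairs = (sequence[i:i + 2] for i in range(len(sequence) - 1))
    # Assumption: pairs containing non-standard symbols are skipped.
    counts = Counter(p for p in pairs if p in DIPEPTIDE_SET)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def majority_vote(predictions):
    """Fuse one predicted label per classifier into a single label.

    Assumption: ties are resolved in favor of the tied label that
    appears first in the list of predictions.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def mcc_for_class(y_true, y_pred, label):
    """One-vs-rest Matthews correlation coefficient for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention used here: return 0.0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    features = dipeptide_composition("MKVLAAGIVALLLAAG")  # toy sequence
    print(len(features))
    print(majority_vote(["nucleus", "cytoplasm", "nucleus"]))
    print(mcc_for_class(["a", "a", "b", "b"], ["a", "b", "b", "b"], "a"))
```

In an ensemble of the kind described, each base classifier would be trained on its own feature space (DC, or PseAAC of a given dimension) and `majority_vote` would be applied to the per-classifier labels predicted for each protein.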