Most existing methods for protein subcellular localization prediction rely on a large number of features that are considered potentially useful for determining where a protein resides in the cell. However, predictors with many input variables usually suffer from the curse of dimensionality and from the risk of overfitting. Using only the features that are relevant to subcellular localization might improve prediction performance and might also yield biologically useful knowledge. In this paper, we present a feature subset selection approach, based on feature ranking, for predicting the subcellular localization of proteins with support vector machines (SVMs). Experimental results show that the method improves prediction performance when the selected feature subsets are used. We anticipate that the proposed method will be a powerful tool for large-scale annotation of biological data.
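As an illustration of the general idea only, ranking-based subset selection can be sketched in a few lines of Python. The abstract does not state which ranking criterion is used, so the F-score below (between-class over within-class scatter) is an assumption. The function names and the toy data are hypothetical, and the SVM training step is left out to keep the sketch dependency-free.

```python
from statistics import mean


def f_score(column, labels):
    """Between-class scatter divided by within-class scatter for one feature.

    Assumed ranking criterion; the abstract does not name the one actually used.
    """
    overall = mean(column)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        values = [v for v, y in zip(column, labels) if y == cls]
        centre = mean(values)
        between += len(values) * (centre - overall) ** 2
        within += sum((v - centre) ** 2 for v in values)
    if within == 0:
        # Perfectly separated feature, or a constant one that carries no signal.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return feature indices ordered from most to least discriminative."""
    n_features = len(X[0])
    scores = [f_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_top_k(X, ranking, k):
    """Keep only the k highest-ranked feature columns of X."""
    keep = ranking[:k]
    return [[row[j] for j in keep] for row in X]


# Hypothetical toy data: 4 proteins, 3 features, 2 localization classes.
X = [
    [0.1, 5.0, 1.0],
    [0.2, 3.0, 1.1],
    [0.9, 4.0, 0.9],
    [1.0, 6.0, 1.0],
]
y = ["nuclear", "nuclear", "cytoplasmic", "cytoplasmic"]

ranking = rank_features(X, y)
X_reduced = select_top_k(X, ranking, 2)
```

In a full pipeline, each candidate subset size `k` would be scored by cross-validating an SVM on `select_top_k(X, ranking, k)`, and the best-performing subset kept. That evaluation loop is omitted here.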