The nature of statistical learning theory
Better Prediction of Protein Cellular Localization Sites with the k Nearest Neighbors Classifier
Proceedings of the 5th International Conference on Intelligent Systems for Molecular Biology
Sub-cellular localisation plays an important role in genome analysis. This paper describes a new residue-couple model that uses a support vector machine to predict the sub-cellular localisation of proteins. The new approach provides better predictions than the existing methods. With fivefold cross-validation, the total prediction accuracies on Reinhardt and Hubbard's dataset reach 92.0% for prokaryotic protein sequences and 86.9% for eukaryotic protein sequences. On a new dataset of 8304 proteins from eight sub-cellular locations, the total accuracy reaches 88.9%. The model is also robust to N-terminal errors in the sequences.
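The abstract does not define the residue-couple encoding, so the following is only an illustrative sketch under stated assumptions: it treats a "residue couple" as an ordered pair of amino acids separated by a fixed gap and turns a sequence into a 400-dimensional vector of normalised pair frequencies. The names `residue_couple_features`, `five_fold_splits`, and the `gap` parameter are hypothetical and not taken from the paper. Such vectors could serve as input to a support vector machine, which is not shown here.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each ordered pair such as "AC" to a position in the feature vector.
PAIR_INDEX = {a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def residue_couple_features(sequence, gap=1):
    """Return normalised counts of ordered residue pairs
    (sequence[i], sequence[i + gap]).

    Assumption: this pair-frequency encoding is a stand-in for the paper's
    residue-couple model, whose exact definition the abstract does not give.
    """
    counts = [0.0] * len(PAIR_INDEX)
    total = 0
    for i in range(len(sequence) - gap):
        idx = PAIR_INDEX.get(sequence[i] + sequence[i + gap])
        if idx is not None:  # skip pairs containing non-standard residues
            counts[idx] += 1.0
            total += 1
    if total:
        counts = [c / total for c in counts]
    return counts


def five_fold_splits(n_items, n_folds=5):
    """Yield (train_indices, test_indices) for a simple interleaved
    k-fold split; every item is used for testing exactly once."""
    folds = [list(range(k, n_items, n_folds)) for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

In a fivefold evaluation of the kind the abstract reports, a classifier would be trained on each `train` index set and scored on the matching `test` set, with total accuracy computed over all held-out predictions.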