Probabilistic prediction of protein phosphorylation sites using kernel machines

Authors:
Mark Menor;Guylaine Poisson;Kyungim Baek
Affiliations:
University of Hawai'i at Manoa, Honolulu, HI;University of Hawai'i at Manoa, Honolulu, HI;University of Hawai'i at Manoa, Honolulu, HI
Venue:
Proceedings of the 27th Annual ACM Symposium on Applied Computing
Year:
2012

Abstract

Phosphorylation is an important post-translational modification of proteins that is essential to the regulation of many cellular processes. The in vivo and in vitro discovery of phosphorylation sites is an expensive, time-consuming and laborious task. In this preliminary study, we assess the viability of using our proposed probabilistic Classification Relevance Units Machine (CRUM) for in silico phosphorylation site prediction. We compare it with the popular Support Vector Machine (SVM) and with the Relevance Vector Machine (RVM), which, unlike the SVM, has not previously been applied to phosphorylation site prediction. The resulting CRUM and RVM predictors offer predictive performance comparable to that of the SVM. The main advantages of the CRUM and RVM over the SVM are: (1) an estimate of the posterior probability that a site is phosphorylatable, which gives biologists an important measure of the uncertainty of the prediction; and (2) a more parsimonious model, which reduces prediction run-time, an important consideration for predictions on large-scale data. Furthermore, the CRUM training algorithm has lower run-time and memory complexity and a simpler parameter selection scheme than the RVM learning algorithm. We therefore conclude that the CRUM is the most viable kernel machine for probabilistic prediction of protein phosphorylation sites.
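
The two advantages named in the abstract, a posterior probability and a cheaper prediction step from a smaller model, can be illustrated with a minimal sketch. The abstract does not give the CRUM model equations, so the form below is an assumption: a generic sparse kernel classifier in which a weighted sum of kernel evaluations is passed through a logistic sigmoid, as in RVM-style classification. The function names, the RBF kernel choice, the `gamma` value, and the toy units and weights are all hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, u, gamma=0.5):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, u))
    return math.exp(-gamma * sq_dist)


def posterior_probability(x, units, weights, bias=0.0, gamma=0.5):
    """Estimate P(site is phosphorylatable | x) with a sparse kernel model.

    Assumed model form (not from the paper): sigmoid(bias + sum_i w_i * k(x, u_i)).
    The loop performs one kernel evaluation per retained unit, so a more
    parsimonious model (fewer units) makes each prediction cheaper.
    """
    activation = bias + sum(
        w * rbf_kernel(x, u, gamma) for u, w in zip(units, weights)
    )
    return 1.0 / (1.0 + math.exp(-activation))


# Hypothetical toy model: two units in a 2-D feature space (made-up numbers).
units = [(0.0, 0.0), (1.0, 1.0)]
weights = [2.0, -2.0]

p_near_first = posterior_probability((0.0, 0.0), units, weights)
p_midpoint = posterior_probability((0.5, 0.5), units, weights)
p_near_second = posterior_probability((1.0, 1.0), units, weights)
```

In this toy setup a point equidistant from both units receives a probability of 0.5, which signals maximal uncertainty, while points close to the positively or negatively weighted unit move toward 1 or 0. A standard SVM, by contrast, returns a signed decision value that is not a probability without additional calibration.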