Communications of the ACM
Elements of information theory
Bounds on the Sample Complexity of Bayesian Learning Using Information Theory and the VC Dimension
Machine Learning - Special issue on computational learning theory
Bayesian Classification With Gaussian Processes
IEEE Transactions on Pattern Analysis and Machine Intelligence
Fast training of support vector machines using sequential minimal optimization
Advances in kernel methods
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
Prediction with Gaussian processes: from linear regression to linear prediction and beyond
Learning in graphical models
Machine Learning - The Eleventh Annual Conference on Computational Learning Theory
Learning curves for Gaussian processes
Proceedings of the 1998 conference on Advances in neural information processing systems 11
An introduction to support vector machines and other kernel-based learning methods
AI Game Programming Wisdom
A Tutorial on Support Vector Machines for Pattern Recognition
Data Mining and Knowledge Discovery
PAC-Bayesian Stochastic Model Selection
Machine Learning
Sparse on-line Gaussian processes
Neural Computation
A family of algorithms for approximate Bayesian inference
Sparse Bayesian learning and the relevance vector machine
The Journal of Machine Learning Research
Neural Computation
Gaussian Processes for Classification: Mean-Field Algorithms
Neural Computation
Structural risk minimization over data-dependent hierarchies
IEEE Transactions on Information Theory
Generalization error bounds for Bayesian mixture algorithms
The Journal of Machine Learning Research
Sparse Multinomial Logistic Regression: Fast Algorithms and Generalization Bounds
IEEE Transactions on Pattern Analysis and Machine Intelligence
PAC-Bayes risk bounds for sample-compressed Gibbs classifiers
ICML '05 Proceedings of the 22nd international conference on Machine learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
Assessing Approximate Inference for Binary Gaussian Process Classification
The Journal of Machine Learning Research
PAC-Bayesian learning of linear classifiers
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Model Selection: Beyond the Bayesian/Frequentist Divide
The Journal of Machine Learning Research
Selective sampling for classification
Canadian AI'08 Proceedings of the Canadian Society for Computational Studies of Intelligence, 21st Conference on Advances in Artificial Intelligence
The Journal of Machine Learning Research
Learning with randomized majority votes
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Distribution-dependent PAC-Bayes priors
ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
A PAC-Bayes bound for tailored density estimation
ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
Covariance in Unsupervised Learning of Probabilistic Grammars
The Journal of Machine Learning Research
PAC-Bayesian Analysis of Co-clustering and Beyond
The Journal of Machine Learning Research
International Journal of Robotics Research
Variational multinomial logit Gaussian process
The Journal of Machine Learning Research
The safe Bayesian: learning the learning rate via the mixability gap
ALT'12 Proceedings of the 23rd international conference on Algorithmic Learning Theory
Tighter PAC-Bayes bounds through distribution-dependent priors
Theoretical Computer Science
PAC-Bayes bounds with data-dependent priors
The Journal of Machine Learning Research
Approximate Bayesian Gaussian process classification (GPC) techniques are powerful non-parametric learning methods, similar in appearance and performance to support vector machines. Based on simple probabilistic models, they yield interpretable results and can be embedded in Bayesian frameworks for model selection, feature selection, etc. In this paper, by applying the PAC-Bayesian theorem of McAllester (1999a), we prove distribution-free generalisation error bounds for a wide range of approximate Bayesian GPC techniques. We also provide a new and much simplified proof of this powerful theorem, making use of convex duality, a concept that underpins many machine learning techniques. We instantiate and test our bounds for two particular GPC techniques, including a recent sparse method that circumvents the unfavourable scaling of standard GP algorithms. As shown in experiments on a real-world task, the bounds can be very tight for moderate training sample sizes. To the best of our knowledge, these results provide the tightest known distribution-free error bounds for approximate Bayesian GPC methods, giving a strong learning-theoretical justification for the use of these techniques.
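For orientation, the PAC-Bayesian theorem referred to above can be sketched as follows. This is a paraphrase in the binary-KL form common in this literature, not a quotation from the paper; exact constants and conditions vary across statements. With probability at least 1 - \delta over an i.i.d. sample S of size m drawn from the data distribution \mathcal{D}, simultaneously for all posteriors Q,

\[
\mathrm{kl}\big(\hat{L}_S(Q) \,\big\|\, L(Q)\big) \;\le\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{m+1}{\delta}}{m},
\]

where P is a prior over classifiers fixed before seeing S, \hat{L}_S(Q) and L(Q) are the empirical and true Gibbs risks of Q, and \mathrm{kl}(q \,\|\, p) = q \ln\frac{q}{p} + (1-q)\ln\frac{1-q}{1-p} is the binary relative entropy. For a Gaussian approximate posterior Q and a GP prior P, \mathrm{KL}(Q \,\|\, P) is available in closed form, which is what makes such bounds computable for GPC.

Turning the statement into a numeric risk bound amounts to inverting the binary KL, which bisection handles since \mathrm{kl}(q \,\|\, p) is increasing in p on [q, 1). A minimal sketch in Python, assuming the form of the bound above; the helper names and all numbers are illustrative, not taken from the paper:

import math

def binary_kl(q, p):
    # Binary relative entropy kl(q || p), with the 0 * log 0 = 0 convention.
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    val = 0.0
    if q > 0.0:
        val += q * math.log(q / p)
    if q < 1.0:
        val += (1.0 - q) * math.log((1.0 - q) / (1.0 - p))
    return val

def kl_inverse_upper(q_hat, budget, tol=1e-9):
    # Largest p in [q_hat, 1) with kl(q_hat || p) <= budget, found by bisection.
    lo, hi = q_hat, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binary_kl(q_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

def pac_bayes_gibbs_bound(emp_risk, kl_q_p, m, delta=0.05):
    # Upper bound on the true Gibbs risk per the sketched theorem.
    budget = (kl_q_p + math.log((m + 1) / delta)) / m
    return kl_inverse_upper(emp_risk, budget)

# Toy usage with made-up numbers: m = 1000 training points, empirical
# Gibbs risk 0.05, KL(Q || P) = 20 nats, confidence level 95%.
print(pac_bayes_gibbs_bound(emp_risk=0.05, kl_q_p=20.0, m=1000))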