Results on learnability and the Vapnik-Chervonenkis dimension
Information and Computation
C4.5: Programs for Machine Learning
The weighted majority algorithm
Information and Computation
An experimental and theoretical comparison of model selection methods
COLT '95 Proceedings of the eighth annual conference on Computational learning theory
Predicting Nearly As Well As the Best Pruning of a Decision Tree
Machine Learning (Special issue on the eighth annual conference on Computational Learning Theory, COLT '95)
Using and combining predictors that specialize
STOC '97 Proceedings of the twenty-ninth annual ACM symposium on Theory of computing
A PAC analysis of a Bayesian estimator
COLT '97 Proceedings of the tenth annual conference on Computational learning theory
An efficient extension to mixture techniques for prediction and decision trees
COLT '97 Proceedings of the tenth annual conference on Computational learning theory
A Fast, Bottom-Up Decision Tree Pruning Algorithm with Near-Optimal Generalization
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Concept learning using complexity regularization
IEEE Transactions on Information Theory
PAC-Bayesian generalisation error bounds for Gaussian process classification
The Journal of Machine Learning Research
Generalization error bounds for Bayesian mixture algorithms
The Journal of Machine Learning Research
Semi-supervised learning using randomized mincuts
ICML '04 Proceedings of the twenty-first international conference on Machine learning
A comparison of tight generalization error bounds
ICML '05 Proceedings of the 22nd international conference on Machine learning
PAC-Bayes risk bounds for sample-compressed Gibbs classifiers
ICML '05 Proceedings of the 22nd international conference on Machine learning
PAC-Bayesian learning of linear classifiers
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Explicit learning curves for transduction and application to clustering and compression algorithms
Journal of Artificial Intelligence Research
Transductive Rademacher complexity and its applications
Journal of Artificial Intelligence Research
Learning Permutations with Exponential Weights
The Journal of Machine Learning Research
Learning Permutations with Exponential Weights
COLT '07 Proceedings of the 20th annual conference on Learning theory
Learning with randomized majority votes
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Human Action Segmentation and Recognition Using Discriminative Semi-Markov Models
International Journal of Computer Vision
The missing consistency theorem for Bayesian learning: stochastic model selection
ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Generalization error bounds using unlabeled data
COLT'05 Proceedings of the 18th annual conference on Learning Theory
Sparse regression learning by aggregation and Langevin Monte-Carlo
Journal of Computer and System Sciences
The safe Bayesian: learning the learning rate via the mixability gap
ALT'12 Proceedings of the 23rd international conference on Algorithmic Learning Theory
A Computational Learning Theory of Active Object Recognition Under Uncertainty
International Journal of Computer Vision
Scalable inference in max-margin topic models
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Generalized relational topic models with data augmentation
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
PAC-Bayesian learning methods combine the informative priors of Bayesian methods with distribution-free PAC guarantees. Stochastic model selection predicts a class label by stochastically sampling a classifier according to a “posterior distribution” on classifiers. This paper gives a PAC-Bayesian performance guarantee for stochastic model selection that is superior to analogous guarantees for deterministic model selection. The guarantee is stated in terms of the training error of the stochastic classifier and the KL-divergence of the posterior from the prior. It is shown that the posterior optimizing the performance guarantee is a Gibbs distribution. Simpler posterior distributions are also derived that have nearly optimal performance guarantees.
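The abstract's bound and its optimizing Gibbs posterior can be sketched for a finite classifier class. The sketch below is illustrative only: it assumes one common McAllester-style form of the bound, E_Q[err] ≤ E_Q[êrr] + sqrt((KL(Q‖P) + ln(m/δ) + 2) / (2m − 1)), and the function names and the temperature parameter `lam` are hypothetical, not from the paper.

```python
import numpy as np

def pac_bayes_bound(train_errors, prior, posterior, m, delta=0.05):
    """Assumed McAllester-style PAC-Bayes bound (illustrative form):
    Gibbs training error plus a complexity term driven by KL(Q||P)."""
    # KL-divergence of posterior Q from prior P (assumes Q > 0 everywhere)
    kl = np.sum(posterior * np.log(posterior / prior))
    # Training error of the stochastic (Gibbs) classifier: E_Q[empirical error]
    gibbs_train = np.dot(posterior, train_errors)
    return gibbs_train + np.sqrt((kl + np.log(m / delta) + 2) / (2 * m - 1))

def gibbs_posterior(train_errors, prior, lam, m):
    """The bound-optimizing posterior is a Gibbs distribution:
    Q(h) proportional to P(h) * exp(-lam * m * err_hat(h)),
    where lam is a hypothetical inverse-temperature parameter."""
    weights = prior * np.exp(-lam * m * train_errors)
    return weights / weights.sum()
```

As `lam` grows, the Gibbs posterior interpolates from the prior (`lam = 0`) toward a point mass on the empirical risk minimizer, trading training error against KL-divergence from the prior.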