Bounds on the Sample Complexity of Bayesian Learning Using Information Theory and the VC Dimension

Authors:
David Haussler; Michael Kearns; Robert E. Schapire
Affiliations:
Computer and Information Sciences, University of California, Santa Cruz, CA 95064. HAUSSLER@CSE.UCSC.EDU; AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974. MKEARNS@RESEARCH.ATT.COM; AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974. SCHAPIRE@RESEARCH.ATT.COM
Venue:
Machine Learning - Special issue on computational learning theory
Year:
1994

Abstract

In this paper we study a Bayesian or average-case model of concept learning with a twofold goal: to provide more precise characterizations of learning curve (sample complexity) behavior that depend on properties of both the prior distribution over concepts and the sequence of instances seen by the learner, and to smoothly unite in a common framework the popular statistical physics and VC dimension theories of learning curves. To achieve this, we undertake a systematic investigation and comparison of two fundamental quantities in learning and information theory: the probability of an incorrect prediction for an optimal learning algorithm, and the Shannon information gain. This study leads to a new understanding of the sample complexity of learning in several existing models.
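
The two quantities compared in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper. It assumes a finite class of threshold concepts over the integers, a uniform prior, and noise-free labels, and every name in it (`label`, `posterior_one`, `step_quantities`, `update`) is hypothetical. For one new instance it computes the posterior probability p that the target labels the instance 1, then reports the mistake probability of the Bayes-optimal predictor, min(p, 1 - p), the mistake probability of a Gibbs predictor that labels with a concept drawn at random from the posterior, 2p(1 - p), and the expected Shannon information gain from seeing the label, the binary entropy H(p) in bits.

```python
import math

def label(threshold, x):
    """Threshold concept: x is labeled 1 iff x >= threshold."""
    return 1 if x >= threshold else 0

def posterior_one(version_space, prior, x):
    """Posterior probability that the target concept labels x as 1."""
    total = sum(prior[t] for t in version_space)
    ones = sum(prior[t] for t in version_space if label(t, x) == 1)
    return ones / total

def binary_entropy(p):
    """Entropy, in bits, of a binary outcome with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def step_quantities(version_space, prior, x):
    """Per-instance quantities before the label of x is revealed."""
    p = posterior_one(version_space, prior, x)
    return {
        "bayes_error": min(p, 1 - p),            # predict the majority label
        "gibbs_error": 2 * p * (1 - p),          # predict with a random consistent concept
        "expected_info_gain": binary_entropy(p), # expected bits learned from the label
    }

def update(version_space, x, y):
    """Keep only the concepts consistent with the labeled example (x, y)."""
    return [t for t in version_space if label(t, x) == y]

# Assumed toy setup: eight thresholds with a uniform prior.
thresholds = list(range(1, 9))
prior = {t: 1 / len(thresholds) for t in thresholds}

first = step_quantities(thresholds, prior, 4)   # the instance splits the class evenly
remaining = update(thresholds, 4, 1)            # target labels 4 as 1
second = step_quantities(remaining, prior, 1)   # a less informative instance
```

In this setup the first instance splits the version space evenly, so both predictors err with probability 0.5 and the label carries one full bit. The second instance splits the remaining concepts 1 to 3, so both the mistake probabilities and the expected information gain drop, which is the kind of coupling between prediction error and information gain that the abstract describes.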