To learn, it suffices to estimate the errors of all candidate hypotheses simultaneously. We study when this "simultaneous estimation" is possible and show that it leads to new learning procedures and to weaker sufficient conditions for a broad class of learning problems. We modify the standard Probably Approximately Correct (PAC) setup to allow concepts that are "stochastic functions": whereas a deterministic function maps a set X into a set Y, a stochastic function is a probability distribution on X × Y. We approach the simultaneous estimation problem by concentrating on a subset of all estimators, those that satisfy a natural "smoothness" constraint; the common empirical estimator falls within this class. We show that smooth simultaneous estimability can be characterized by a sampling-based criterion, and we describe a canonical estimator for this class of problems. The canonical estimator has a distinctive two-phase form: it uses part of the sample to select a finite subset of hypotheses that approximates the class of candidate hypotheses, and then uses the rest of the sample to estimate the error of each hypothesis in that subset. Finally, we show that a learning procedure based on the canonical estimator succeeds in every case where empirical error minimization does.
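The two-phase structure of the canonical estimator can be illustrated with a minimal sketch. The example below is an assumption-laden toy, not the paper's construction: the hypothesis class is threshold classifiers on [0, 1], the stochastic concept is a noisy threshold at 0.5, and all names (`draw_sample`, `canonical_estimator`) are hypothetical. The first half of the sample selects a finite subset of thresholds that approximates the class on the observed data; the second half estimates the error of each selected hypothesis.

```python
import random

random.seed(0)

# Stochastic concept: a distribution on X x Y. Here X = [0, 1] and the
# label is a threshold at 0.5 whose output is flipped with probability 0.1.
def draw_sample(n):
    pts = []
    for _ in range(n):
        x = random.random()
        y = int(x >= 0.5)
        if random.random() < 0.1:
            y = 1 - y
        pts.append((x, y))
    return pts

def canonical_estimator(sample):
    half = len(sample) // 2
    select, estimate = sample[:half], sample[half:]
    # Phase 1: use the first half of the sample to pick a finite subset of
    # hypotheses. For thresholds on [0, 1], the observed x-values induce
    # every empirically distinct threshold, so they (plus the endpoints)
    # form a finite subset approximating the whole class.
    candidates = sorted({x for x, _ in select}) + [0.0, 1.0]
    # Phase 2: use the second half to estimate the error of each candidate.
    errors = {}
    for t in candidates:
        mistakes = sum(1 for x, y in estimate if int(x >= t) != y)
        errors[t] = mistakes / len(estimate)
    return errors

sample = draw_sample(400)
errors = canonical_estimator(sample)
# A learner based on the canonical estimator picks the error minimizer.
best_t = min(errors, key=errors.get)
print(best_t, errors[best_t])
```

With enough data, the selected threshold lands near the true boundary 0.5 and its estimated error approaches the 0.1 noise rate, mirroring the claim that learning via the canonical estimator succeeds wherever empirical error minimization does.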