A general lower bound on the number of examples needed for learning
Information and Computation
Learnability and the Vapnik-Chervonenkis dimension
Journal of the ACM (JACM)
Learning Nested Differences of Intersection-Closed Concept Classes
Machine Learning
Predicting {0, 1}-functions on randomly drawn points
Information and Computation
Learning nested differences in the presence of malicious noise
Theoretical Computer Science - Special issue on algorithmic learning theory
Approximating hyper-rectangles: learning and pseudorandom sets
Journal of Computer and System Sciences - Fourteenth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems
On-line learning with malicious noise and the closure algorithm
Annals of Mathematics and Artificial Intelligence
For hyper-rectangles in $$\mathbb{R}^{d}$$, Auer (1997) proved a PAC bound of $$O(\frac{1}{\varepsilon}(d+\log \frac{1}{\delta}))$$, where $$\varepsilon$$ and $$\delta$$ are the accuracy and confidence parameters. It is still an open question whether one can obtain the same bound for intersection-closed concept classes of VC-dimension $$d$$ in general. We present a step towards a solution of this problem. On the one hand, we show a new PAC bound of $$O(\frac{1}{\varepsilon}(d\log d + \log \frac{1}{\delta}))$$ for arbitrary intersection-closed concept classes, complementing the well-known bounds $$O(\frac{1}{\varepsilon}(\log \frac{1}{\delta}+d\log \frac{1}{\varepsilon}))$$ and $$O(\frac{d}{\varepsilon}\log \frac{1}{\delta})$$ of Blumer et al. (1989) and Haussler, Littlestone and Warmuth (1994). Our bound is established using the closure algorithm, which generates as its hypothesis the intersection of all concepts that are consistent with the positive training examples. On the other hand, we show that many intersection-closed concept classes, including, e.g., maximum intersection-closed classes, satisfy an additional combinatorial property that allows a proof of the optimal bound of $$O(\frac{1}{\varepsilon}(d+\log \frac{1}{\delta}))$$. For such improved bounds the choice of the learning algorithm is crucial, as there are consistent learning algorithms that need $$\Omega(\frac{1}{\varepsilon}(d\log\frac{1}{\varepsilon} +\log\frac{1}{\delta}))$$ examples to learn some particular maximum intersection-closed concept classes.
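As an illustration of the closure algorithm described above, the following minimal sketch instantiates it for axis-parallel hyper-rectangles, where the intersection of all concepts consistent with the positive examples is simply their smallest bounding box. The names `closure_hypothesis` and `predict` are hypothetical and not taken from the paper, and the sketch assumes that the empty set counts as a concept, so that a sample without positive examples yields the empty hypothesis.

```python
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]
# A box is stored as (lower corner, upper corner); None stands for the empty concept.
Box = Tuple[Tuple[float, ...], Tuple[float, ...]]


def closure_hypothesis(sample: List[Tuple[Point, bool]]) -> Optional[Box]:
    """Return the smallest axis-parallel box containing all positive examples.

    Negative examples are ignored: the closure algorithm uses only the
    positively labelled points of the sample.
    """
    positives = [x for x, label in sample if label]
    if not positives:
        # Assumption: the empty set is a concept of the class.
        return None
    # zip(*positives) groups the coordinates by dimension.
    lower = tuple(min(coords) for coords in zip(*positives))
    upper = tuple(max(coords) for coords in zip(*positives))
    return lower, upper


def predict(box: Optional[Box], x: Point) -> bool:
    """Classify x as positive exactly when it lies inside the hypothesis box."""
    if box is None:
        return False
    lower, upper = box
    return all(lo <= xi <= hi for lo, xi, hi in zip(lower, x, upper))
```

Since the hypothesis is always contained in the target concept, it can err only on positive points, never on negative ones.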