Communications of the ACM
Computational limitations on learning from examples
Journal of the ACM (JACM)
Learnability and the Vapnik-Chervonenkis dimension
Journal of the ACM (JACM)
A guided tour of Chernoff bounds
Information Processing Letters
Investigating the distribution assumptions in the Pac learning model
COLT '91 Proceedings of the fourth annual workshop on Computational learning theory
On learning binary weights for majority functions
COLT '91 Proceedings of the fourth annual workshop on Computational learning theory
Handbook of Mathematical Functions, With Formulas, Graphs, and Mathematical Tables,
Handbook of Mathematical Functions, With Formulas, Graphs, and Mathematical Tables,
Average case analysis of the clipped Hebb rule for nonoverlapping perception networks
COLT '93 Proceedings of the sixth annual conference on Computational learning theory
Hi-index | 0.00 |
We present an algorithm that PAC learns any perceptron with binary weights and arbitrary threshold under the family of product distributions. The sample complexity of this algorithm is of O[(n/ε)4 ln(n/δ)] and its running time increases only linearly with the number of training examples. The algorithm does not try to find an hypothesis that agrees with all of the training examples; rather, it constructs a binary perceptron based on various probabilistic estimates obtained from the training examples. We show that, under the restricted case of the uniform distribution and zero threshold, the algorithm reduces to the well known clipped Hebb rule. We calculate exactly the average generalization rate (i.e., the learning curve) of the algorithm, under the uniform distribution, in the limit of an infinite number of dimensions. We find that the error rate decreases exponentially as a function of the number of training examples. Hence, the average case analysis gives a sample complexity of O[n ln(1/ε)], a large improvement over the PAC learning analysis. The analytical expression of the learning curve is in excellent agreement with the extensive numerical simulations. In addition, the algorithm is very robust with respect to classification noise.