Sample-efficient strategies for learning in the presence of noise

Authors:
Nicolò Cesa-Bianchi;Eli Dichterman;Paul Fischer;Eli Shamir;Hans Ulrich Simon
Affiliations:
Univ. of Milan, Milan, Italy;IBM Haifa Research Lab, Haifa, Israel;Univ. of Dortmund, Dortmund, Germany;Hebrew Univ., Jerusalem, Israel;Ruhr-Univ., Bochum, Germany
Venue:
Journal of the ACM (JACM)
Year:
1999

Abstract

In this paper, we prove various results about PAC learning in the presence of malicious noise. Our main interest is the sample size behavior of learning algorithms. We prove the first nontrivial sample complexity lower bound in this model by showing that order of ε/Δ² + d/Δ (up to logarithmic factors) examples are necessary for PAC learning any target class of {0,1}-valued functions of VC dimension d, where ε is the desired accuracy and η = ε/(1 + ε) - Δ is the malicious noise rate (it is well known that no nontrivial target class can be PAC learned with accuracy ε and malicious noise rate η ≥ ε/(1 + ε), irrespective of sample complexity). We also show that this result cannot be significantly improved in general by presenting efficient learning algorithms for the class of all subsets of d elements and the class of unions of at most d intervals on the real line. This is especially interesting as we can also show that the popular minimum disagreement strategy needs samples of size dε/Δ², hence is not optimal with respect to sample size. We then discuss the use of randomized hypotheses. For these, the bound ε/(1 + ε) on the noise rate is no longer true and is replaced by 2ε/(1 + 2ε). In fact, we present a generic algorithm using randomized hypotheses that can tolerate noise rates slightly larger than ε/(1 + ε) while using samples of size d/ε as in the noise-free case. Again one observes a quadratic power law (in this case dε/Δ², with Δ = 2ε/(1 + 2ε) - η) as Δ goes to zero. We show upper and lower bounds of this order.
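
To make the quantities in the abstract concrete, the following minimal Python sketch evaluates the two noise-rate thresholds and the order-of-magnitude sample-size expressions for given accuracy ε (eps), noise rate η (eta), and VC dimension d. The function names are illustrative assumptions and not from the paper. The sample-size functions return only the leading-order expressions quoted in the abstract, with constants and logarithmic factors omitted, so they should not be read as actual sample sizes.

```python
def deterministic_threshold(eps: float) -> float:
    """Noise-rate bound eps / (1 + eps) for deterministic hypotheses."""
    return eps / (1.0 + eps)


def randomized_threshold(eps: float) -> float:
    """Noise-rate bound 2*eps / (1 + 2*eps) for randomized hypotheses."""
    return 2.0 * eps / (1.0 + 2.0 * eps)


def gap(threshold: float, eta: float) -> float:
    """Delta: distance of the noise rate eta below the given threshold."""
    delta = threshold - eta
    if delta <= 0:
        raise ValueError("noise rate is at or above the threshold")
    return delta


def lower_bound_order(eps: float, delta: float, d: int) -> float:
    """Leading-order term eps/Delta^2 + d/Delta (constants, logs omitted)."""
    return eps / delta**2 + d / delta


def min_disagreement_order(eps: float, delta: float, d: int) -> float:
    """Leading-order term d*eps/Delta^2 for minimum disagreement."""
    return d * eps / delta**2


if __name__ == "__main__":
    eps, d = 0.1, 10
    eta = 0.08  # hypothetical noise rate, chosen below eps / (1 + eps)
    delta = gap(deterministic_threshold(eps), eta)
    print("deterministic threshold:", deterministic_threshold(eps))
    print("randomized threshold:   ", randomized_threshold(eps))
    print("lower bound order:      ", lower_bound_order(eps, delta, d))
    print("min. disagreement order:", min_disagreement_order(eps, delta, d))
```

For any fixed ε > 0 the randomized threshold is strictly larger than the deterministic one, which matches the abstract's remark that randomized hypotheses tolerate higher noise rates.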