This letter is a comprehensive account of some recent findings about AdaBoost in the presence of noisy data, approached from the perspective of statistical theory. We start from the basic assumption of weak hypotheses used in AdaBoost and study its validity and its implications for generalization error. When data are noisy, we recommend studying the generalization error and comparing it to the optimal Bayes error. Analytic examples are provided to show that running the unmodified AdaBoost forever leads to overfitting. On the other hand, there exist regularized versions of AdaBoost that are consistent, in the sense that the resulting prediction rule approximately attains the optimal Bayes performance in the limit of large training samples.
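
The qualitative claim above (unmodified AdaBoost run for many rounds overfits noisy data, while a regularized variant does not) can be illustrated numerically. The sketch below is not taken from the letter: it uses scikit-learn's AdaBoostClassifier on a synthetic dataset with artificially flipped labels, and treats early stopping on a held-out validation set as a stand-in for the regularized procedures discussed in the letter; the dataset, the 20% noise rate, and the number of rounds are assumptions chosen purely for illustration.

# Minimal sketch (illustrative assumptions only): AdaBoost on label-noisy data,
# comparing the error after many rounds with the error at an early-stopped round.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=4000, n_features=10, random_state=0)
y = np.where(rng.rand(len(y)) < 0.2, 1 - y, y)  # flip 20% of the labels (label noise)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
X_fit, X_val, y_fit, y_val = train_test_split(X_train, y_train, test_size=0.4, random_state=0)

# Unmodified AdaBoost with decision stumps, run for many rounds.
booster = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                             n_estimators=2000, random_state=0)
booster.fit(X_fit, y_fit)

# staged_predict returns the ensemble's prediction after each boosting round,
# so a single fit traces the whole error-versus-rounds curve.
val_err = [np.mean(p != y_val) for p in booster.staged_predict(X_val)]
test_err = [np.mean(p != y_test) for p in booster.staged_predict(X_test)]

best_round = int(np.argmin(val_err))  # early stopping: round with lowest validation error
print("test error after 2000 rounds:", test_err[-1])
print("test error at early-stopped round %d:" % (best_round + 1), test_err[best_round])

On noisy data of this kind, the test error at the early-stopped round is typically noticeably lower than the error after the full 2000 rounds, which is the behavior the analytic examples in the letter formalize.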