Information Processing Letters
A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, special issue on the 26th Annual ACM Symposium on Theory of Computing (STOC '94) and the Second Annual European Conference on Computational Learning Theory (EuroCOLT '95).
Boosting in the limit: maximizing the margin of learned ensembles. AAAI '98/IAAI '98: Proceedings of the Fifteenth National Conference on Artificial Intelligence / Tenth Conference on Innovative Applications of Artificial Intelligence.
Prediction games and arcing algorithms. Neural Computation.
Machine Learning
Generalization Performance of Classifiers in Terms of Observed Covering Numbers. EuroCOLT '99: Proceedings of the 4th European Conference on Computational Learning Theory.
Data-dependent margin-based generalization bounds for classification. The Journal of Machine Learning Research.
An empirical comparison of supervised learning algorithms. ICML '06: Proceedings of the 23rd International Conference on Machine Learning.
How boosting the margin can also boost classifier complexity. ICML '06: Proceedings of the 23rd International Conference on Machine Learning.
Concentration inequalities for functions of independent variables. Random Structures & Algorithms.
Some Theory for Generalized Boosting Algorithms. The Journal of Machine Learning Research.
The Journal of Machine Learning Research
Evidence Contrary to the Statistical View of Boosting. The Journal of Machine Learning Research.
Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theoretical Computer Science.
The Top Ten Algorithms in Data Mining.
Boosting through optimization of margin distributions. IEEE Transactions on Neural Networks.
AAAI '96: Proceedings of the Thirteenth National Conference on Artificial Intelligence, Volume 1
A Refined Margin Analysis for Boosting Algorithms via Equilibrium Margin. The Journal of Machine Learning Research.
Ensemble Methods: Foundations and Algorithms.
Margin theory provides one of the most popular explanations for the success of AdaBoost; its central claim is that the margin is the key quantity characterizing the performance of AdaBoost. This theory has been very influential: for example, it has been used to argue that AdaBoost usually does not overfit, since it tends to enlarge margins even after the training error reaches zero. A minimum margin bound was established for AdaBoost early on, but Breiman (1999) [9] pointed out that maximizing the minimum margin does not necessarily lead to better generalization. Later, Reyzin and Schapire (2006) [37] emphasized that the margin distribution, rather than the minimum margin, is crucial to the performance of AdaBoost. In this paper, we first present the kth margin bound and study its relationship to previous results such as the minimum margin bound and the Emargin bound. We then improve previous empirical Bernstein bounds (Audibert et al., 2009; Maurer and Pontil, 2009) [2,30], and, based on these results, we defend the margin-based explanation against Breiman's doubts by proving a new generalization error bound that considers exactly the same factors as Schapire et al. (1998) [39] but is sharper than Breiman's (1999) [9] minimum margin bound. By incorporating factors such as the average margin and the margin variance, we present a generalization error bound that depends on the whole margin distribution. We also provide margin distribution bounds on the generalization error of voting classifiers over hypothesis spaces of finite VC dimension.
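As background for the terms used above, the following LaTeX sketch records the standard definitions from the margin literature; the notation (f, h_t, \alpha_t, S, m, d, \theta) is conventional rather than taken from this paper, and the bound shown is the well-known margin-distribution bound of Schapire et al. (1998) [39] in its usual O(.) form.

% Voting classifier produced by AdaBoost after T rounds, with base
% classifiers h_t(x) \in \{-1,+1\} and nonnegative weights \alpha_t:
\[
  f(x) = \frac{\sum_{t=1}^{T} \alpha_t h_t(x)}{\sum_{t=1}^{T} \alpha_t} .
\]
% Margin of f on a labeled example (x, y) with y \in \{-1,+1\};
% f misclassifies (x, y) exactly when the margin is nonpositive:
\[
  \operatorname{margin}_f(x, y) = y f(x) \in [-1, 1].
\]
% Minimum margin over a training sample S = \{(x_1,y_1),\dots,(x_m,y_m)\};
% the kth margin is the kth smallest of the values y_i f(x_i):
\[
  \theta_{\min} = \min_{1 \le i \le m} y_i f(x_i).
\]
% Margin-distribution bound of Schapire et al. (1998): if the base
% classifiers come from a space of VC dimension d, then for every
% \theta > 0, with probability at least 1 - \delta over the draw of S,
\[
  \Pr_{D}\bigl[ y f(x) \le 0 \bigr]
    \le \Pr_{S}\bigl[ y f(x) \le \theta \bigr]
      + O\!\left( \sqrt{ \frac{d \log^2(m/d)}{m\,\theta^2}
                         + \frac{\log(1/\delta)}{m} } \right).
\]

Breiman's minimum margin bound and the bounds developed in this paper share this general shape, differing in which statistics of the margin distribution (minimum, kth smallest, average, variance) control the complexity term.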