Boosting combines weak learners into a predictor with low empirical risk. Its dual constructs a high-entropy distribution under which the weak learners and the training labels are uncorrelated. This manuscript studies this primal-dual relationship under a broad family of losses, including the exponential loss of AdaBoost and the logistic loss, revealing:

• Weak learnability aids the whole loss family: for any ε > 0, O(ln(1/ε)) iterations suffice to produce a predictor whose empirical risk is ε-close to the infimum;
• The circumstances under which an empirical risk minimizer exists may be characterized in terms of the primal and dual problems, yielding a new proof of the known rate O(ln(1/ε));
• Arbitrary instances may be decomposed into the above two cases, granting the rate O(1/ε), with a matching lower bound provided for the logistic loss.
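As a concrete illustration of this primal-dual picture, here is a minimal sketch (not the manuscript's algorithm) of boosting as coordinate descent on the exponential loss: the normalized per-example losses form the dual distribution over examples, and the next weak learner is the one most correlated with the labels under that distribution. The margin-matrix encoding A[i, j] = y_i · h_j(x_i) and all names here are illustrative assumptions.

```python
import numpy as np

def boost(A, T=100, newton_steps=20):
    """Sketch: boosting as coordinate descent on the exponential loss.

    A is an (m examples) x (n weak learners) margin matrix with
    A[i, j] = y_i * h_j(x_i) in [-1, 1]; lam weights the weak learners.
    """
    m, n = A.shape
    lam = np.zeros(n)
    for _ in range(T):
        w = np.exp(-A @ lam)                 # per-example exponential losses
        q = w / w.sum()                      # dual distribution over examples
        j = int(np.argmax(np.abs(A.T @ q)))  # learner most correlated under q
        # Approximate line search along coordinate j via a few Newton steps;
        # under weak learnability the exact minimizer can be unbounded, so
        # the fixed step count also acts as a crude cap.
        alpha, a = 0.0, A[:, j]
        for _ in range(newton_steps):
            w = np.exp(-(A @ lam + alpha * a))
            g = -(w * a).sum()               # d/d(alpha) of the loss
            h = (w * a * a).sum()            # second derivative
            if h <= 1e-12:
                break
            alpha -= g / h
        lam[j] += alpha
    return lam
```

With ±1-valued weak learners, q above is exactly the high-entropy dual distribution described in the abstract: coordinate descent stops making progress precisely when every weak learner is uncorrelated with the training labels under q.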