AdaBoost is a popular and effective leveraging procedure for improving the hypotheses generated by weak learning algorithms. AdaBoost and many other leveraging algorithms can be viewed as performing constrained gradient descent on a potential function. At each iteration, the distribution over the sample given to the weak learner is proportional to the direction of steepest descent. We introduce a new leveraging algorithm based on a natural potential function. For this potential function, the direction of steepest descent can have negative components; we therefore provide two techniques for obtaining suitable distributions from these directions of steepest descent. The resulting algorithms have bounds that are incomparable to AdaBoost's. The analysis suggests that our algorithm is likely to perform better than AdaBoost on noisy data and with weak learners that return low-confidence hypotheses. Modest experiments confirm that our algorithm can outperform AdaBoost in these situations.
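As a rough illustration of the gradient-descent view described above (and not the algorithm introduced in the paper), the Python sketch below runs a generic leveraging loop: at each round the sample distribution is taken proportional to the steepest-descent direction of a potential over the margins. The exponential potential used here recovers AdaBoost-style weights; the helper names (leverage, weak_learner) and the clipping of negative components to zero are assumptions made purely for illustration, since the paper's own potential and its two reweighting techniques are not reproduced here.

import numpy as np

def exp_potential_grad(margins):
    # Gradient of the exponential potential sum_i exp(-margin_i) with respect
    # to the margins; its negation gives the (unnormalized) descent direction.
    return -np.exp(-margins)

def leverage(X, y, weak_learner, potential_grad=exp_potential_grad, rounds=50):
    # X: feature matrix, y: labels in {-1, +1}.
    # weak_learner(X, y, dist) is assumed to return a callable hypothesis
    # h with h(X) in {-1, +1}, trained on the weighted sample `dist`.
    n = len(y)
    margins = np.zeros(n)          # cumulative margins y_i * F(x_i)
    ensemble = []                  # list of (alpha, hypothesis) pairs
    for _ in range(rounds):
        direction = -potential_grad(margins)   # steepest-descent direction
        # For the exponential potential this direction is non-negative.
        # Other potentials may yield negative components; mapping them to a
        # distribution (here: truncate at zero) is one simple assumed choice.
        weights = np.clip(direction, 0.0, None)
        if weights.sum() == 0:
            break
        dist = weights / weights.sum()
        h = weak_learner(X, y, dist)
        preds = h(X)
        edge = float(np.dot(dist, y * preds))  # weighted correlation with labels
        if edge <= 0 or edge >= 1:
            break
        alpha = 0.5 * np.log((1 + edge) / (1 - edge))  # AdaBoost-style step size
        margins += alpha * y * preds
        ensemble.append((alpha, h))
    # Final combined classifier: sign of the weighted vote.
    return lambda X_new: np.sign(sum(a * h(X_new) for a, h in ensemble))

With the exponential potential this reduces to familiar AdaBoost updates; swapping in a different potential_grad changes only how the sample distribution is derived each round, which is the point of the gradient-descent view sketched in the abstract.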