Several authors have suggested viewing boosting as a gradient descent search for a good fit in function space: at each iteration, observations are re-weighted using the gradient of the underlying loss function. We present a weight-decay approach for observation weights that is equivalent to "robustifying" the underlying loss function. At the extreme end of decay, the approach converges to Bagging, which can be viewed as boosting with a linear underlying loss function. We illustrate the practical usefulness of weight decay for improving prediction performance, and we establish an equivalence between one form of weight decay and "Huberizing", a statistical method for making loss functions more robust.
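
To make the weight-decay idea concrete, the following Python sketch is illustrative only: the power-decay and weight-capping schemes, the decay and cap parameters, and the function name boosting_weights are assumptions chosen for exposition, not the paper's exact formulation. Under the exponential loss, an observation with margin m receives boosting weight exp(-m); shrinking these weights flattens the implied loss toward the linear loss of Bagging, while capping them yields a bounded gradient, which corresponds to a loss that is linear in the tail (a Huber-style robustification).

    import numpy as np

    def boosting_weights(margins, decay=1.0, cap=None):
        """Observation weights for exponential-loss boosting, with two
        illustrative robustification schemes (hypothetical forms):
          * power decay: w_i = exp(-m_i) ** decay. As decay -> 0 the
            implied loss flattens toward a linear loss and all weights
            become equal, i.e., Bagging-like behavior.
          * capping: w_i = min(w_i, cap). A bounded gradient corresponds
            to a loss that is linear in the tail, a Huber-style fix.
        """
        w = np.exp(-margins) ** decay          # plain boosting: decay = 1
        if cap is not None:
            w = np.minimum(w, cap)             # bound outliers' influence
        return w / w.sum()                     # normalize to a distribution

    # Badly misclassified points (large negative margins) no longer dominate:
    margins = np.array([2.0, 0.5, -0.5, -3.0])
    print(boosting_weights(margins))                      # standard weights
    print(boosting_weights(margins, decay=0.5, cap=2.0))  # decayed and capped

In the capped variant, the most misclassified point (margin -3.0) receives a fixed weight rather than an exponentially large one, which is the mechanism by which such schemes limit the influence of outliers on the fitted ensemble.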