An empirical comparison of three boosting algorithms on real data sets with artificial class noise

  • Authors:
  • Ross A. McDonald; David J. Hand; Idris A. Eckley

  • Affiliations:
  • Imperial College London; Imperial College London; Shell Research Ltd.

  • Venue:
  • MCS'03 Proceedings of the 4th International Conference on Multiple Classifier Systems
  • Year:
  • 2003

Abstract

Boosting algorithms are a means of building a strong ensemble classifier by aggregating a sequence of weak hypotheses. In this paper we consider three of the best-known boosting algorithms: Adaboost [9], Logitboost [11], and Brownboost [8]. These algorithms are adaptive, and work by maintaining a set of example and class weights which focus the attention of a base learner on the examples that are hardest to classify. We conduct an empirical study to compare the performance of these algorithms, measured in terms of overall test error rate, on five real data sets. The tests consist of a series of repeated validation samples. At each validation, we set aside one third of the data chosen at random as a test set, and fit the boosting algorithm to the remaining two thirds, using binary stumps as the base learner. At each stage we record the final training and test error rates, and report the average errors with a 95% confidence interval. We then add artificial class noise to our data sets by randomly reassigning 20% of class labels, and repeat our experiment. We find that Brownboost and Logitboost are less likely than Adaboost to overfit in this circumstance.
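
The evaluation protocol described above (repeated one-third/two-thirds random splits, boosting with decision stumps, and 20% artificial label noise) can be sketched roughly as follows. This minimal Python sketch uses scikit-learn's AdaBoostClassifier, whose default base learner is a depth-1 decision stump; it covers only Adaboost, since Logitboost and Brownboost are not provided by scikit-learn. The stand-in data set (breast cancer), the number of trials and boosting rounds, and the decision to flip binary labels for the noise step are illustrative assumptions, not the paper's actual setup.

    # Sketch of the repeated split-and-test protocol, under the assumptions above.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split

    def run_trials(X, y, noise_rate=0.0, n_trials=30, n_rounds=100, seed=0):
        rng = np.random.default_rng(seed)
        if noise_rate > 0:
            # Randomly reassign the given fraction of (binary 0/1) class labels
            # before the experiment, mimicking the artificial class noise step.
            y = y.copy()
            flip = rng.random(len(y)) < noise_rate
            y[flip] = 1 - y[flip]
        test_errors = []
        for t in range(n_trials):
            # Set aside one third of the data at random as a test set and fit
            # the booster to the remaining two thirds.
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=1 / 3, random_state=seed + t)
            # AdaBoost with its default base learner, a depth-1 decision stump.
            clf = AdaBoostClassifier(n_estimators=n_rounds, random_state=seed + t)
            clf.fit(X_tr, y_tr)
            test_errors.append(1.0 - clf.score(X_te, y_te))
        errors = np.asarray(test_errors)
        # Mean test error with a normal-approximation 95% confidence interval.
        return errors.mean(), 1.96 * errors.std(ddof=1) / np.sqrt(n_trials)

    X, y = load_breast_cancer(return_X_y=True)
    for rate in (0.0, 0.2):
        mean_err, ci = run_trials(X, y, noise_rate=rate)
        print(f"noise={rate:.0%}: test error {mean_err:.3f} +/- {ci:.3f}")

Comparing the noiseless and 20%-noise runs in this way gives the kind of test-error summary the paper reports, though the paper's data sets and its Logitboost and Brownboost results are not reproduced here.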