A boosting approach to remove class label noise

  • Authors:
  • Amitava Karmaker;Stephen Kwek

  • Affiliations:
  • (Corresponding author: akarmake@cs.utsa.edu) Department of Computer Science, University of Texas at San Antonio, TX 78249, USA;Department of Computer Science, University of Texas at San Antonio, TX 78249, USA

  • Venue:
  • International Journal of Hybrid Intelligent Systems - Hybrid Intelligent systems in Ensembles
  • Year:
  • 2006

Abstract

Ensemble methods are known to improve prediction accuracy over their base learning algorithms, and AdaBoost is among the best-recognized of them. However, it is prone to overfitting training instances corrupted by class label noise. This paper proposes a modification of AdaBoost that is more tolerant of class label noise, which further enhances its ability to boost prediction accuracy. In particular, we observe that in AdaBoost the weight increase of noisy examples can be constrained by carefully applying a cut-off to their weights. We study the characteristics of our technique empirically on artificially generated data sets and corroborate the findings on a number of data sets from the UCI repository [1]. In both experimental settings, the results affirm the efficiency of our approach. Finally, we investigate some significant characteristics of our technique in noisy environments.
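
The abstract does not state the exact cut-off rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It runs standard AdaBoost with one-dimensional decision stumps and, after each reweighting step, caps every example weight at a multiple of the uniform weight 1/n. The names `capped_adaboost`, `cap_weights`, and `cap_factor`, the choice of cap, and the redistribution of excess weight to the uncapped examples are all assumptions made for illustration.

```python
import math

def stump_predict(thr, pol, x):
    """Predict +1/-1 with a 1-D decision stump (threshold, polarity)."""
    return pol if x >= thr else -pol

def best_stump(xs, ys, w):
    """Return (threshold, polarity, weighted_error) of the lowest-error stump."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                      if stump_predict(thr, pol, xi) != yi)
            if best is None or err < best[2]:
                best = (thr, pol, err)
    return best

def cap_weights(w, cap):
    """Clip weights at `cap` and hand the excess to the uncapped weights.

    Hypothetical cut-off rule: assumes sum(w) == 1 and cap > 1/len(w),
    so the total weight is preserved and no weight ends above the cap.
    """
    w = list(w)
    for _ in range(len(w)):
        excess = sum(wi - cap for wi in w if wi > cap)
        if excess <= 1e-12:
            break
        free = sum(wi for wi in w if wi < cap)
        if free <= 0:
            break
        scale = 1.0 + excess / free
        w = [cap if wi >= cap else wi * scale for wi in w]
    return w

def capped_adaboost(xs, ys, rounds=10, cap_factor=3.0):
    """AdaBoost with stumps; weights never exceed cap_factor / n (assumed rule)."""
    n = len(xs)
    cap = cap_factor / n
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        thr, pol, err = best_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Standard AdaBoost update: misclassified examples gain weight.
        w = [wi * math.exp(-alpha * yi * stump_predict(thr, pol, xi))
             for xi, yi, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
        # Cut-off step: limits how far a (possibly mislabeled) example can grow.
        w = cap_weights(w, cap)
    return ensemble, w

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(t, p, x) for a, t, p in ensemble)
    return 1 if score >= 0 else -1
```

Calling `capped_adaboost(xs, ys, cap_factor=3.0)` on a small data set with one flipped label returns the ensemble together with the final example weights. Under these assumptions, an example that is repeatedly misclassified can hold at most `cap_factor / n` of the total weight, so it cannot dominate later rounds the way it can in unmodified AdaBoost.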