A Boosting Approach to Remove Class Label Noise

  • Authors:
  • Amitava Karmaker; Stephen Kwek

  • Affiliations:
  • University of Texas at San Antonio; University of Texas at San Antonio

  • Venue:
  • HIS '05: Proceedings of the Fifth International Conference on Hybrid Intelligent Systems
  • Year:
  • 2005

Abstract

Ensemble methods are known to improve prediction accuracy over the base learning algorithm, and AdaBoost is among the best-recognized of them. However, AdaBoost is susceptible to overfitting training instances whose class labels are corrupted by noise. This paper proposes a modification to AdaBoost that is more tolerant of class label noise, which further enhances its ability to boost prediction accuracy. In particular, we observe that the weight hike AdaBoost gives to noisy examples can be constrained by carefully applying a cut-off to their weights. The effectiveness of our algorithm is demonstrated empirically on artificially generated data and corroborated on a number of data sets from the UCI repository [1]. In both experimental settings, the results affirm the efficacy of our approach. Finally, we investigate some of the significant characteristics of our technique in noisy environments.
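
The abstract describes the mechanism only at a high level: limiting the weight growth of suspected-noisy examples by cutting off their weights during boosting. The sketch below is a minimal Python illustration of that general idea, not the paper's exact algorithm; the `cap` parameter, the clipping rule, and the decision-stump weak learner are assumptions made for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_with_weight_cap(X, y, n_rounds=50, cap=None):
    """Discrete AdaBoost (labels in {-1, +1}) with an optional per-example
    weight cut-off. `cap` is a hypothetical parameter: after each
    re-weighting step, weights above `cap` are clipped before
    renormalization, limiting the weight hike of possibly noisy examples."""
    n = len(y)
    w = np.full(n, 1.0 / n)               # start with uniform weights
    stumps, alphas = [], []

    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)

        err = np.sum(w * (pred != y))
        if err == 0 or err >= 0.5:        # stop if the weak learner is perfect or no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / err)

        # Standard AdaBoost re-weighting: misclassified examples gain weight.
        w *= np.exp(-alpha * y * pred)

        # Cut-off step (the modification sketched in the abstract):
        # clip weights so noisy examples cannot dominate later rounds.
        if cap is not None:
            w = np.minimum(w, cap)

        w /= w.sum()                       # renormalize to a distribution
        stumps.append(stump)
        alphas.append(alpha)

    def predict(X_new):
        agg = sum(a * s.predict(X_new) for a, s in zip(alphas, stumps))
        return np.sign(agg)

    return predict
```

With `cap=None` this reduces to ordinary discrete AdaBoost; choosing a small cap (e.g. a few times the uniform weight 1/n) is one plausible way to restrain mislabeled examples, though the cut-off rule actually used in the paper may differ.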