Class Noise Mitigation Through Instance Weighting

  • Authors:
  • Umaa Rebbapragada; Carla E. Brodley

  • Affiliations:
  • Dept. of Computer Science, Tufts University, 161 College Ave., Medford, MA 02155, USA;Dept. of Computer Science, Tufts University, 161 College Ave., Medford, MA 02155, USA

  • Venue:
  • ECML '07 Proceedings of the 18th European Conference on Machine Learning
  • Year:
  • 2007

Abstract

We describe a novel framework for class noise mitigation that assigns a vector of class membership probabilities to each training instance and uses the confidence in the current label as a weight during training. The probability vector should be calculated such that clean instances have high confidence in their current label, while mislabeled instances have low confidence in their current label and high confidence in their correct label. Past research has focused on techniques that either discard or correct instances. This paper proposes that discarding and correcting are special cases of instance weighting and are thus part of this framework. We propose a method that uses clustering to calculate a probability distribution over the class labels for each instance. We demonstrate that our method improves classifier accuracy relative to training on the original training set. We also demonstrate that instance weighting can outperform discarding.
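
The framework in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which clustering algorithm is used or how the probability vector is derived from the clusters. The sketch assumes hard cluster assignments are already available and takes each instance's probability vector to be the label frequencies within its cluster. All function and variable names are hypothetical. It also shows how discarding (weights restricted to 0 or 1) and correcting (relabeling to the most probable class) fall out as special cases of weighting.

```python
from collections import Counter, defaultdict

def cluster_label_distributions(cluster_ids, labels, classes):
    """Assumed scheme: an instance's class probabilities are the label
    frequencies observed in the cluster it belongs to."""
    counts = defaultdict(Counter)
    for c, y in zip(cluster_ids, labels):
        counts[c][y] += 1
    probs = []
    for c in cluster_ids:
        total = sum(counts[c].values())
        probs.append([counts[c][k] / total for k in classes])
    return probs

def confidence_weights(probs, labels, classes):
    """Instance weighting: weight = confidence in the current label."""
    return [p[classes.index(y)] for p, y in zip(probs, labels)]

def discard_weights(probs, labels, classes, threshold=0.5):
    """Discarding as a special case: weights are forced to 0 or 1.
    The threshold value is an illustrative assumption."""
    return [1.0 if w >= threshold else 0.0
            for w in confidence_weights(probs, labels, classes)]

def corrected_labels(probs, classes):
    """Correcting as a special case: relabel to the most probable class
    (ties go to the class listed first)."""
    return [classes[max(range(len(classes)), key=p.__getitem__)]
            for p in probs]

# Toy data: two clusters of four instances, one likely mislabel in each.
classes = ["a", "b"]
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["a", "a", "a", "b", "b", "b", "b", "a"]

probs = cluster_label_distributions(cluster_ids, labels, classes)
weights = confidence_weights(probs, labels, classes)
keep = discard_weights(probs, labels, classes)
corrected = corrected_labels(probs, classes)

print(weights)    # the two minority-label instances get weight 0.25
print(keep)       # the same two instances get weight 0.0
print(corrected)  # the same two instances are relabeled
```

The resulting weights could then be passed to any learner that accepts per-instance weights during training. Unlike discarding, weighting keeps the two suspicious instances in the training set with reduced influence.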