Decontamination of Training Samples for Supervised Pattern Recognition Methods

  • Authors:
  • Ricardo Barandela;Eduardo Gasca

  • Venue:
  • Proceedings of the Joint IAPR International Workshops on Advances in Pattern Recognition
  • Year:
  • 2000

Abstract

This work discusses what have been called 'imperfectly supervised situations': pattern recognition applications in which the assumption of label correctness does not hold for every element of the training sample. A methodology is presented for contending with these practical situations and avoiding their negative impact on the performance of supervised methods. The methodology can be regarded as a cleaning process that removes some suspicious instances from the training sample or corrects the class labels of others while retaining them. It was conceived for classification with the Nearest Neighbor rule, a supervised nonparametric classifier that combines conceptual simplicity with an asymptotic error rate bounded in terms of the optimal Bayes error. However, initial experiments on the learning phase of a Multilayer Perceptron (not reported in the present work) seem to indicate a broader applicability. Results with both simulated and real data sets are presented to support the methodology and to clarify the ideas behind it. Related works are briefly reviewed, and some issues deserving further research are outlined.
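
The abstract does not spell out the cleaning procedure, so the following is only a minimal sketch of one way a "remove or relabel" pass could look, not the authors' algorithm. It assumes a k-nearest-neighbor vote over the training sample itself: a sample whose neighbors agree with its label is kept, a sample whose neighbors vote strongly for another class is relabeled, and the rest are discarded. The function name `decontaminate` and the parameters `k` and `k_relabel` are hypothetical, chosen for illustration.

```python
import numpy as np

def decontaminate(X, y, k=3, k_relabel=3):
    """Hypothetical sketch: relabel or remove training samples by k-NN vote.

    Assumption (not from the source): a sample is relabeled when at least
    `k_relabel` of its `k` nearest neighbors share one other class, and is
    removed when its neighbors disagree with it less decisively.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)

    # Pairwise squared Euclidean distances; a sample is never its own neighbor.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    neighbors = np.argsort(d2, axis=1, kind="stable")[:, :k]

    new_y = y.copy()
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        # Votes always use the original labels, not labels corrected so far.
        labels, counts = np.unique(y[neighbors[i]], return_counts=True)
        top = counts.argmax()
        if labels[top] == y[i]:
            continue                    # neighbors agree with the given label
        if counts[top] >= k_relabel:
            new_y[i] = labels[top]      # strong disagreement: correct the label
        else:
            keep[i] = False             # weak disagreement: discard the sample
    return X[keep], new_y[keep], keep
```

With `k_relabel` equal to `k`, only unanimous neighborhoods trigger a correction; setting `k_relabel` larger than `k` disables relabeling, so every suspicious sample is simply removed. The cleaned sample would then serve as the reference set for a Nearest Neighbor classifier.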