Identifying and Eliminating Irrelevant Instances Using Information Theory

  • Authors:
  • Marc Sebban; Richard Nock

  • Venue:
  • AI '00 Proceedings of the 13th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence
  • Year:
  • 2000

Abstract

While classical approaches treat prototype selection (PS) as an accuracy-maximization problem, in this paper we investigate PS as an information-preserving problem. We use information theory to build a statistical criterion from the nearest-neighbor topology. This statistical framework underlies a backward prototype selection algorithm (PSRCG), which identifies and eliminates uninformative instances, thereby reducing the global uncertainty of the learning set. From experimental results and rigorous comparisons we draw two main conclusions: (i) our approach offers a good compromise, keeping a small number of prototypes while not compromising classification accuracy; (ii) our PSRCG algorithm appears robust in the presence of noise. Performance on several benchmarks shows the relevance and effectiveness of our method in comparison with classic accuracy-based PS algorithms.
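To make the idea concrete, the following is a minimal sketch of a backward, uncertainty-driven prototype selection loop. It is an illustrative assumption, not the paper's PSRCG criterion (which is not specified in this abstract): here "global uncertainty" is approximated as the mean Shannon entropy of the class labels in each instance's k-nearest-neighbor neighborhood, and instances are greedily removed while that quantity decreases. All function names and the choice of entropy measure are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a multiset of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def knn_indices(points, i, k):
    """Indices of the k nearest neighbors of point i (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
        for j in range(len(points)) if j != i
    )
    return [j for _, j in dists[:k]]

def global_uncertainty(points, labels, k):
    """Mean entropy of the k-NN label neighborhoods (a stand-in for
    the paper's information-theoretic criterion)."""
    n = len(points)
    if n <= k:
        return 0.0
    return sum(entropy([labels[j] for j in knn_indices(points, i, k)])
               for i in range(n)) / n

def backward_selection(points, labels, k=3):
    """Backward PS sketch: repeatedly drop the instance whose removal
    most reduces global uncertainty; stop when no removal helps."""
    pts, lbs = list(points), list(labels)
    current = global_uncertainty(pts, lbs, k)
    while len(pts) > k + 1:
        best_i, best_u = None, current
        for i in range(len(pts)):
            u = global_uncertainty(pts[:i] + pts[i + 1:],
                                   lbs[:i] + lbs[i + 1:], k)
            if u < best_u:
                best_i, best_u = i, u
        if best_i is None:   # no removal reduces uncertainty: stop
            break
        del pts[best_i], lbs[best_i]
        current = best_u
    return pts, lbs
```

On a toy set of two clean clusters plus one mislabeled "noisy" point, the loop tends to discard the noisy instance first, which matches the abstract's claim that eliminating uninformative instances lowers the learning set's global uncertainty.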