Data characterization for effective prototype selection

  • Authors:
  • Ramón A. Mollineda;J. Salvador Sánchez;José M. Sotoca

  • Affiliations:
  • Dept. Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló de la Plana, Spain;Dept. Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló de la Plana, Spain;Dept. Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló de la Plana, Spain

  • Venue:
  • IbPRIA'05 Proceedings of the Second Iberian conference on Pattern Recognition and Image Analysis - Volume Part II
  • Year:
  • 2005

Abstract

The Nearest Neighbor classifier is one of the most popular supervised classification methods. It is simple, intuitive, and accurate in a wide variety of real-world applications. Despite its simplicity and effectiveness, practical use of this rule has historically been limited by its high storage requirements and the computational cost involved, as well as by the presence of outliers. To overcome these drawbacks, a suitable prototype selection scheme can be employed to reduce storage and computing time, and it usually also provides some increase in classification accuracy. Nevertheless, in some practical cases prototype selection may even degrade the effectiveness of the classifier. From an empirical point of view, it is still difficult to know a priori when this method will behave appropriately. The present paper tries to predict how appropriate a prototype selection algorithm will be for a particular problem by characterizing the data with a set of complexity measures.