A class boundary preserving algorithm for data condensation

  • Authors:
  • K. Nikolaidis; J. Y. Goulermas; Q. H. Wu

  • Affiliations:
  • Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, UK; Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, UK; Department of Electrical Engineering and Electronics, University of Liverpool, Liverpool L69 3GJ, UK

  • Venue:
  • Pattern Recognition
  • Year:
  • 2011

Abstract

Instance-based machine learning algorithms often suffer from having to store large numbers of training instances. This results in high memory usage, long response times, and often oversensitivity to noise. To overcome these problems, various instance reduction algorithms have been developed to remove noisy and surplus instances. This paper discusses existing algorithms in the field of instance selection and abstraction, and introduces a new approach, the Class Boundary Preserving Algorithm (CBP), a multi-stage method for pruning the training set based on a simple but very effective heuristic for instance removal. CBP is tested on a large number of datasets and comparatively evaluated against eight of the most successful instance-based condensation algorithms. Experiments showed that our algorithm achieved similar classification accuracies, with much improved storage reduction and competitive execution speeds.
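
The abstract does not describe CBP's heuristic in detail, so the following is not CBP. As a rough illustration of what instance condensation means in general, here is a minimal Python sketch of a Hart-style condensed nearest neighbour pass, one classic approach to dropping surplus instances. The function names (`nearest_label`, `condense`) and the toy data are assumptions made for this example only.

```python
from math import dist


def nearest_label(store, point):
    """Return the label of the stored instance closest to `point` (1-NN, Euclidean)."""
    return min(store, key=lambda inst: dist(inst[0], point))[1]


def condense(training):
    """Hart-style condensation (illustrative, not CBP).

    Start with one stored instance and add a training instance only when
    the current store misclassifies it. Repeat full passes until a pass
    adds nothing.
    """
    store = [training[0]]
    changed = True
    while changed:
        changed = False
        for point, label in training:
            if nearest_label(store, point) != label:
                store.append((point, label))
                changed = True
    return store


# Toy two-class dataset (hypothetical values): two well-separated clusters.
training = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]
condensed = condense(training)
print(len(training), "->", len(condensed), "stored instances")
```

On data like this, the condensed store still classifies every training instance correctly with 1-NN while holding far fewer instances, which is the storage-versus-accuracy trade-off the abstract refers to.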