A randomized sphere cover classifier

  • Authors:
  • Reda Younsi; Anthony Bagnall

  • Affiliations:
  • School of Computing Sciences, University of East Anglia, Norwich, UK; School of Computing Sciences, University of East Anglia, Norwich, UK

  • Venue:
  • IDEAL'10 Proceedings of the 11th international conference on Intelligent data engineering and automated learning
  • Year:
  • 2010

Abstract

This paper describes an instance-based classifier, the randomised sphere covering classifier (αRSC), that reduces the size of the training data set without loss of accuracy when compared to nearest neighbour classifiers. The motivation for developing this algorithm is the desire for a non-deterministic, fast, instance-based classifier that performs well in isolation but is also ideal for use in ensembles. Essentially, we accept increased training time in exchange for decreased testing time, whilst retaining the simple and intuitive nature of instance-based classifiers. We use twenty-four benchmark datasets from the UCI repository for evaluation. The first set of experiments demonstrates the basic benefits of sphere covering: we show that there is no significant difference in accuracy between the basic αRSC algorithm and a nearest neighbour classifier, even though αRSC compresses the data by up to 75%. We then describe a pruning algorithm that removes spheres that contain α or fewer training instances. The second set of experiments demonstrates that when the α parameter is set through cross-validation, the resulting αRSC algorithm outperforms several well-known classifiers when compared using the Friedman rank sum test. Thirdly, we highlight the benefits of pruning with a bias/variance decomposition. Finally, we discuss why the randomisation inherent in αRSC makes it an ideal ensemble component and outline our future direction.
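
The abstract does not spell out how the spheres are built, so the following is only a minimal sketch of one plausible randomised sphere cover with α pruning, not the authors' implementation. It assumes Euclidean distance, a sphere radius equal to the distance from a randomly chosen uncovered instance to its nearest differently labelled instance, and classification by the sphere whose boundary is closest. The names `build_sphere_cover` and `predict`, and the fallback that keeps the unpruned cover when pruning would remove every sphere, are hypothetical choices made for illustration.

```python
import numpy as np

def build_sphere_cover(X, y, alpha=0, seed=None):
    """Illustrative randomised sphere cover (assumed construction, see lead-in)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    uncovered = np.ones(len(X), dtype=bool)
    spheres = []  # each entry: (centre, radius, label, number of instances inside)
    while uncovered.any():
        # Pick a random instance that no sphere covers yet.
        i = rng.choice(np.flatnonzero(uncovered))
        dist = np.linalg.norm(X - X[i], axis=1)
        other = y != y[i]
        # Assumption: the radius reaches the nearest instance of another class.
        radius = dist[other].min() if other.any() else np.inf
        inside = (dist < radius) & ~other
        inside[i] = True  # the centre always covers itself, so the loop terminates
        uncovered &= ~inside
        spheres.append((X[i], radius, y[i], int(inside.sum())))
    # Pruning: drop spheres that contain alpha or fewer training instances.
    kept = [s for s in spheres if s[3] > alpha]
    # Hypothetical fallback: keep the full cover if pruning removes everything.
    return kept if kept else spheres

def predict(spheres, X):
    """Assign each query the label of the sphere with the nearest boundary."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    centres = np.array([s[0] for s in spheres])
    radii = np.array([s[1] for s in spheres])
    labels = np.array([s[2] for s in spheres])
    # Distance to the sphere edge; negative values mean the query lies inside.
    edge = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) - radii
    return labels[edge.argmin(axis=1)]
```

In this sketch the data reduction comes from storing only sphere centres and radii rather than every training instance, and the random choice of centres is what makes repeated runs differ, which is the property the abstract points to as useful for ensembles.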