Using Representative-Based Clustering for Nearest Neighbor Dataset Editing

Authors:
Christoph F. Eick; Nidal Zeidat; Ricardo Vilalta
Affiliations:
University of Houston; University of Houston; University of Houston
Venue:
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Year:
2004

Abstract

The goal of dataset editing in instance-based learning is to remove objects from a training set in order to increase the accuracy of a classifier. For example, Wilson editing removes training examples that are misclassified by a nearest neighbor classifier, which smooths the shape of the resulting decision boundaries. This paper investigates the use of representative-based clustering algorithms for nearest neighbor dataset editing; we term this approach supervised clustering editing. The main idea is to replace a dataset with a set of cluster prototypes. A novel clustering approach called supervised clustering is introduced for this purpose. Our empirical evaluation using eight UCI datasets shows that both Wilson and supervised clustering editing improve accuracy on more than 50% of the datasets tested. However, supervised clustering editing achieves compression rates four times higher than those of Wilson editing.
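
The Wilson editing rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `wilson_edit`, the neighbor count `k`, the choice of Euclidean distance, the leave-one-out evaluation, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import dist


def wilson_edit(points, labels, k=1):
    """Return the indices of examples kept by a Wilson-style editing pass.

    An example is kept only if a k-nearest-neighbor vote over all *other*
    examples (leave-one-out) predicts its own label. Hypothetical helper;
    Euclidean distance is an assumption.
    """
    kept = []
    for i, (p, y) in enumerate(zip(points, labels)):
        # Rank every other example by its distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        # Majority vote among the k closest neighbors; on a tie, Counter
        # returns the label it encountered first.
        votes = Counter(labels[j] for j in others[:k])
        predicted = votes.most_common(1)[0][0]
        if predicted == y:
            kept.append(i)
    return kept


# Toy data (assumed): one "b" example sits inside the "a" cluster.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.05, 0.05),
          (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
labels = ["a", "a", "a", "b", "b", "b", "b"]

kept = wilson_edit(points, labels, k=3)
print(kept)  # the stray "b" example at index 3 is removed
```

In this toy run only the mislabeled-looking example inside the opposite cluster is dropped, which shows why this kind of editing alone tends to yield modest reductions in dataset size.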