The k-Nearest Neighbors (k-NN) classifier is a well-known classification method. However, sequentially searching large datasets for the nearest neighbors degrades its performance because of the high computational cost involved. This paper proposes a cluster-based classification model that speeds up the k-NN classifier, aiming to reduce the computational cost as much as possible while keeping classification accuracy high. The model consists of a simple data structure and a hybrid, adaptive algorithm that accesses it. First, a preprocessing clustering procedure builds the data structure. Then, guided by user-defined acceptance criteria, the proposed algorithm attempts to classify an incoming item using the nearest cluster centroids; if that fails, the item is classified by searching for its k nearest neighbors within specific clusters. The approach was tested on five real-life datasets. The results show that it can be used either to achieve high accuracy with gains in cost or to reduce the cost to a minimum with slightly lower accuracy.
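The two-stage scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a k-means-style preprocessing step and a distance-ratio acceptance criterion, and the class name, `threshold` parameter, and fallback to the two nearest clusters are all hypothetical choices made for the sketch.

```python
import numpy as np
from collections import Counter

class ClusterKNN:
    """Hypothetical sketch of a cluster-based k-NN classifier:
    fast path via nearest centroids, fallback to k-NN within clusters."""

    def __init__(self, n_clusters=3, k=3, threshold=0.5, iters=10, seed=0):
        self.n_clusters = n_clusters
        self.k = k
        self.threshold = threshold  # acceptance criterion (illustrative)
        self.iters = iters
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        # Preprocessing: a simple k-means builds the cluster structure.
        X = np.asarray(X, float)
        y = np.asarray(y)
        idx = self.rng.choice(len(X), self.n_clusters, replace=False)
        C = X[idx].copy()
        for _ in range(self.iters):
            # Assign each point to its nearest centroid, then recompute means.
            a = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for j in range(self.n_clusters):
                if np.any(a == j):
                    C[j] = X[a == j].mean(axis=0)
        self.centroids, self.assign, self.X, self.y = C, a, X, y
        # Majority class per cluster, used by the fast path.
        self.cluster_label = np.array(
            [Counter(y[a == j]).most_common(1)[0][0] if np.any(a == j)
             else y[0] for j in range(self.n_clusters)])
        return self

    def classify(self, x):
        x = np.asarray(x, float)
        d = np.sqrt(((self.centroids - x) ** 2).sum(-1))
        order = np.argsort(d)
        # Fast path: accept the nearest centroid's majority class when it is
        # clearly closer than the runner-up (one possible acceptance rule).
        if len(order) > 1 and d[order[0]] <= self.threshold * d[order[1]]:
            return self.cluster_label[order[0]]
        # Fallback: exact k-NN restricted to the two nearest clusters.
        mask = np.isin(self.assign, order[:2])
        Xs, ys = self.X[mask], self.y[mask]
        nn = np.argsort(((Xs - x) ** 2).sum(-1))[: self.k]
        return Counter(ys[nn]).most_common(1)[0][0]
```

The `threshold` parameter plays the role of the user-defined acceptance criterion: raising it sends more items down the cheap centroid path (lower cost, possibly lower accuracy), while lowering it forces more items into the exact k-NN search within clusters.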