The k-Nearest Neighbors (k-NN) classifier is a well-known classification method. However, sequentially searching large datasets for the nearest neighbors degrades its performance because of the high computational cost involved. This paper proposes a cluster-based classification model that speeds up the k-NN classifier, aiming to reduce the computational cost as much as possible while keeping classification accuracy high. The model consists of a simple data structure and a hybrid, adaptive algorithm that accesses it. First, a preprocessing clustering procedure builds the data structure. Then, guided by user-defined acceptance criteria, the proposed algorithm attempts to classify an incoming item using the nearest cluster centroids; if that fails, the item is classified by searching for its k nearest neighbors within specific clusters. The approach was tested on five real-life datasets. The results show that it can be used either to achieve high accuracy with gains in cost or to reduce the cost to a minimum with slightly lower accuracy.
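The two-stage scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a k-means-style preprocessing step and a distance-ratio acceptance criterion, and the class name, `threshold` parameter, and fallback to the two nearest clusters are all hypothetical choices made for the sketch.

```python
import numpy as np
from collections import Counter

class ClusterKNN:
    """Hypothetical sketch of a cluster-based k-NN classifier:
    fast path via nearest centroids, fallback to k-NN within clusters."""

    def __init__(self, n_clusters=3, k=3, threshold=0.5, iters=10, seed=0):
        self.n_clusters = n_clusters
        self.k = k
        self.threshold = threshold  # acceptance criterion (illustrative)
        self.iters = iters
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        # Preprocessing: a simple k-means builds the cluster structure.
        X = np.asarray(X, float)
        y = np.asarray(y)
        idx = self.rng.choice(len(X), self.n_clusters, replace=False)
        C = X[idx].copy()
        for _ in range(self.iters):
            # Assign each point to its nearest centroid, then recompute means.
            a = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for j in range(self.n_clusters):
                if np.any(a == j):
                    C[j] = X[a == j].mean(axis=0)
        self.centroids, self.assign, self.X, self.y = C, a, X, y
        # Majority class per cluster, used by the fast path.
        self.cluster_label = np.array(
            [Counter(y[a == j]).most_common(1)[0][0] if np.any(a == j)
             else y[0] for j in range(self.n_clusters)])
        return self

    def classify(self, x):
        x = np.asarray(x, float)
        d = np.sqrt(((self.centroids - x) ** 2).sum(-1))
        order = np.argsort(d)
        # Fast path: accept the nearest centroid's majority class when it is
        # clearly closer than the runner-up (one possible acceptance rule).
        if len(order) > 1 and d[order[0]] <= self.threshold * d[order[1]]:
            return self.cluster_label[order[0]]
        # Fallback: exact k-NN restricted to the two nearest clusters.
        mask = np.isin(self.assign, order[:2])
        Xs, ys = self.X[mask], self.y[mask]
        nn = np.argsort(((Xs - x) ** 2).sum(-1))[: self.k]
        return Counter(ys[nn]).most_common(1)[0][0]
```

The `threshold` parameter plays the role of the user-defined acceptance criterion: raising it sends more items down the cheap centroid path (lower cost, possibly lower accuracy), while lowering it forces more items into the exact k-NN search within clusters.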