Adaptive k-nearest-neighbor classification using a dynamic number of nearest neighbors

Authors:
Stefanos Ougiaroglou;Alexandros Nanopoulos;Apostolos N. Papadopoulos;Yannis Manolopoulos;Tatjana Welzer-Druzovec
Affiliations:
Department of Informatics, Aristotle University, Thessaloniki, Greece;Department of Informatics, Aristotle University, Thessaloniki, Greece;Department of Informatics, Aristotle University, Thessaloniki, Greece;Department of Informatics, Aristotle University, Thessaloniki, Greece;Faculty of Electrical Eng. and Computer Science, University of Maribor, Slovenia
Venue:
ADBIS'07 Proceedings of the 11th East European conference on Advances in databases and information systems
Year:
2007

Abstract

Classification based on k nearest neighbors (kNN classification) is one of the most widely used classification methods. The number k of nearest neighbors needed to achieve high classification accuracy is given in advance and depends heavily on the data set. If the data set is large, sequential or binary search for the nearest neighbors (NNs) is impractical because of the increased computational cost, so indexing schemes are frequently used to speed up classification. If the required number of nearest neighbors is high, however, an index alone may not be enough to achieve high performance. In this paper, we demonstrate that the nearest neighbor search can be interrupted once certain criteria are satisfied, so that a decision can be made without computing all k nearest neighbors of a new object. We study three heuristics that add this early-break capability to the nearest neighbor algorithm. The heuristics aim at (i) reducing computation and I/O costs as much as possible, and (ii) keeping classification accuracy high. Experimental results on real-life data sets show that the proposed method achieves better performance than existing methods.
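The early-break idea can be illustrated with a minimal Python sketch. The abstract does not spell out the paper's three heuristics, so the stopping rule below is an assumption chosen for illustration: the search stops as soon as the leading class has more votes than any other class could collect from the neighbors still to be retrieved. The names `incremental_neighbors` and `knn_early_break` are hypothetical, and a heap over brute-force distances stands in for the index-based incremental search the abstract refers to.

```python
import heapq
import math
from collections import Counter


def incremental_neighbors(points, labels, query):
    """Yield (distance, label) pairs in order of increasing distance.

    Assumption: a heap over brute-force distances stands in for an
    index-based incremental nearest-neighbor search.
    """
    heap = [(math.dist(p, query), i) for i, p in enumerate(points)]
    heapq.heapify(heap)
    while heap:
        d, i = heapq.heappop(heap)
        yield d, labels[i]


def knn_early_break(points, labels, query, k):
    """Return (predicted_label, neighbors_examined) for a majority-vote kNN.

    Illustrative stopping rule (not taken from the paper): stop once the
    lead of the top class exceeds the number of neighbors still to fetch,
    because the majority can no longer change.
    """
    votes = Counter()
    examined = 0
    for _, label in incremental_neighbors(points, labels, query):
        votes[label] += 1
        examined += 1
        remaining = k - examined
        ranked = votes.most_common(2)
        lead = ranked[0][1]
        # A class not seen yet has zero votes, so 0 is a safe runner-up.
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if remaining == 0 or lead - runner_up > remaining:
            return ranked[0][0], examined
    # Fewer than k points were available: vote over everything seen.
    winner = votes.most_common(1)[0][0] if votes else None
    return winner, examined
```

For a query whose three closest neighbors share one class, a call with k = 5 returns after examining three neighbors, since the two remaining neighbors cannot overturn a three-vote lead. This particular rule never changes the predicted class relative to a full k-neighbor majority vote; a more aggressive rule would trade some accuracy for fewer retrievals.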