Classification based on k nearest neighbors (kNN classification) is one of the most widely used classification methods. The number k of nearest neighbors needed to achieve high classification accuracy is given in advance and depends strongly on the data set. If the data set is large, a sequential or binary search for nearest neighbors is impractical because of the computational cost, so indexing schemes are frequently used to speed up the classification process. However, if the required number of nearest neighbors is high, even an index may not deliver high performance. In this paper, we demonstrate that the execution of the nearest neighbor search algorithm can be interrupted once certain criteria are satisfied, so that a classification decision can be made without computing all k nearest neighbors of a new object. Three different heuristics are studied for enhancing the nearest neighbor algorithm with an early-break capability. These heuristics aim to: (i) reduce computation and I/O costs as much as possible, and (ii) keep classification accuracy at a high level. Experimental results on real-life data sets demonstrate that the proposed method achieves better performance than existing methods.
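The early-break idea can be sketched as follows: while neighbors arrive in increasing distance order, stop as soon as one class holds a lead that the remaining neighbors cannot overturn. This is a minimal illustrative sketch, not the paper's heuristics; the function name, the brute-force sorted scan (standing in for incremental retrieval from an index such as an R-tree), and the specific stopping rule are assumptions made for illustration.

```python
from collections import Counter
import math

def knn_early_break(train, labels, query, k):
    """Classify `query` by majority vote among its k nearest neighbors,
    breaking early once the leading class is mathematically unbeatable.
    Hypothetical sketch; a real system would draw neighbors incrementally
    from a spatial index instead of sorting all distances up front."""
    # Neighbors in increasing distance order (brute force for clarity).
    neighbors = sorted(
        (math.dist(p, query), lbl) for p, lbl in zip(train, labels)
    )
    votes = Counter()
    for i, (_, lbl) in enumerate(neighbors[:k], start=1):
        votes[lbl] += 1
        leader, count = votes.most_common(1)[0]
        remaining = k - i
        best_other = max(
            (c for l, c in votes.items() if l != leader), default=0
        )
        # Early break: even if every remaining neighbor voted for the
        # runner-up, the leader would still win the majority vote.
        if count > best_other + remaining:
            return leader
    return votes.most_common(1)[0][0]
```

With k = 3 and a query deep inside one class, the loop typically returns after the second neighbor, since two votes out of three already decide the outcome; the paper's heuristics additionally target the I/O cost of fetching those neighbors from the index.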