K-nearest neighbors (KNN) is among the simplest methods for classification. Given a set of objects in a multi-dimensional feature space, the method assigns a category to an unclassified object based on the plurality category among its k nearest neighbors. The closeness between objects is determined using a distance measure, e.g., Euclidean distance. Despite its simplicity, KNN has some drawbacks: 1) it incurs a high computational cost in training when the training set contains millions of objects; 2) its classification time is linear in the size of the training set, so the larger the training set, the longer it takes to search for the k nearest neighbors. In this paper, we propose a new algorithm, called SMART-TV (Small Absolute difference of Total Variation), that approximates a set of candidate nearest neighbors by examining the absolute difference of total variation between each data object in the training set and the unclassified object. The k nearest neighbors are then searched for within that candidate set. We empirically evaluate the performance of our algorithm on both real and synthetic datasets and find that SMART-TV is fast and scalable. Its classification accuracy is high and comparable to that of the traditional KNN algorithm.
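The two-stage idea described in the abstract (filter candidates by total-variation difference, then run an exact nearest-neighbor search among them) can be illustrated with a minimal sketch. The abstract does not define total variation, so the code assumes it is the sum of squared Euclidean distances from every training object to a given point. The function names, the `n_candidates` parameter, and the plurality vote are likewise illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def total_variation(data, a):
    """ASSUMED definition: sum of squared Euclidean distances
    from every object in `data` to the point `a`."""
    return sum(sum((xi - ai) ** 2 for xi, ai in zip(x, a)) for x in data)


def smart_tv_classify(train, labels, query, k=3, n_candidates=10):
    """Hypothetical sketch of a two-stage, SMART-TV-style classifier."""
    # Stage 0: total variation about each training object.
    # This depends only on the training set, so it could be computed once offline.
    tv_train = [total_variation(train, x) for x in train]

    # Stage 1: keep the objects whose total variation is closest to the query's.
    tv_query = total_variation(train, query)
    order = sorted(range(len(train)), key=lambda i: abs(tv_train[i] - tv_query))
    candidates = order[:n_candidates]

    # Stage 2: exact k-nearest-neighbor search restricted to the candidate set.
    nearest = sorted(candidates, key=lambda i: math.dist(train[i], query))[:k]

    # Plurality vote among the k nearest candidates.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11)]
    labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(smart_tv_classify(train, labels, (0.5, 0.5), k=3, n_candidates=4))
    print(smart_tv_classify(train, labels, (10.5, 10.5), k=3, n_candidates=4))
```

Under the assumed definition, objects at the same distance from the training-set mean have the same total variation even when they lie far apart, so the filter alone can admit distant objects. The exact distance search in the second stage is what separates them.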