K-Nearest Neighbor Finding Using MaxNearestDist

Authors:
Hanan Samet
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2008

Citing 0
Cited 8

Context Matching for Realizing Cognitive Wireless Network Segments
Wireless Personal Communications: An International Journal

Instance selection for class imbalanced problems by means of selecting instances more than once
CAEPIA'11 Proceedings of the 14th international conference on Advances in artificial intelligence: Spanish Association for Artificial Intelligence

Coarse to fine K nearest neighbor classifier
Pattern Recognition Letters

GeoWhiz: toponym resolution using common categories
Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems

SAC: semantic adaptive caching for spatial mobile applications
Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems

Structured toponym resolution using combined hierarchical place categories
Proceedings of the 7th Workshop on Geographic Information Retrieval

PhotoStand: a map query interface for a database of news photos
Proceedings of the VLDB Endowment

Seeder finder: identifying additional needles in the Twitter haystack
Proceedings of the 6th ACM SIGSPATIAL International Workshop on Location-Based Social Networks

Quantified Score

Hi-index	0.14


Abstract

Similarity searching often reduces to finding the k nearest neighbors to a query object. Finding the k nearest neighbors is achieved by applying either a depth-first or a best-first algorithm to the search hierarchy containing the data. These algorithms are generally applicable to any index based on hierarchical clustering. The idea is that the data is partitioned into clusters, which are aggregated to form other clusters, with the total aggregation being represented as a tree. These algorithms have traditionally used a lower bound corresponding to the minimum distance at which a nearest neighbor can be found (termed MinDist) to prune the search process by avoiding the processing of some of the clusters as well as individual objects when they can be shown to be farther from the query object q than all of the current k nearest neighbors of q. An alternative pruning technique that uses an upper bound corresponding to the maximum possible distance at which a nearest neighbor is guaranteed to be found (termed MaxNearestDist) is described. The MaxNearestDist upper bound is adapted to enable its use for finding the k nearest neighbors instead of just the nearest neighbor (i.e., k=1) as in its previous uses. Both the depth-first and best-first k-nearest neighbor algorithms are modified to use MaxNearestDist, which is shown to enhance both algorithms by overcoming their shortcomings. In particular, for the depth-first algorithm, the number of clusters in the search hierarchy that must be examined is not increased, thereby potentially lowering its execution time, while for the best-first algorithm, the number of clusters in the search hierarchy that must be retained in the priority queue used to control the ordering of processing of the clusters is also not increased, thereby potentially lowering its storage requirements.
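
As a rough illustration of how the two bounds in the abstract interact, the following Python sketch prunes a flat, one-level set of disjoint clusters. It is not the paper's depth-first or best-first algorithm. The `Cluster` class, `knn_with_pruning`, `build_clusters`, and the grid-based clustering are all hypothetical names and choices made for this example. It assumes each cluster's center is one of its own points, so the distance from q to that center is a valid upper bound on the distance to the nearest object in the cluster. The k-th smallest such bound over disjoint clusters is then an upper bound on the distance to the k-th nearest neighbor.

```python
import heapq
import math
import random


class Cluster:
    """Hypothetical ball cluster whose center (pivot) is one of its own points."""

    def __init__(self, points):
        self.points = list(points)
        self.center = self.points[0]  # assumption: the pivot is a data object
        self.radius = max(math.dist(self.center, p) for p in self.points)

    def min_dist(self, q):
        # Lower bound: no point in the ball can be closer to q than this.
        return max(0.0, math.dist(q, self.center) - self.radius)

    def max_nearest_dist(self, q):
        # Upper bound: the pivot itself is an object at this distance,
        # so the cluster's nearest object is no farther than this.
        return math.dist(q, self.center)


def knn_with_pruning(q, clusters, k):
    """Return (sorted [(distance, point)], number of clusters examined)."""
    # Disjoint clusters each guarantee one distinct object within their bound,
    # so the k-th smallest bound is an upper bound on the k-th neighbor distance.
    bounds = heapq.nsmallest(k, (c.max_nearest_dist(q) for c in clusters))
    d_k_upper = bounds[-1] if len(bounds) == k else math.inf

    # Discard clusters whose MinDist already exceeds the upper bound,
    # then visit the survivors in increasing MinDist order.
    survivors = sorted(
        (c for c in clusters if c.min_dist(q) <= d_k_upper),
        key=lambda c: c.min_dist(q),
    )

    best = []  # max-heap of the k closest points so far, via negated distances
    examined = 0
    for c in survivors:
        if len(best) == k and c.min_dist(q) > -best[0][0]:
            break  # every remaining cluster is at least this far away
        examined += 1
        for p in c.points:
            d = math.dist(q, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-nd, p) for nd, p in best), examined


def build_clusters(points, cell=0.25):
    """Group points into disjoint clusters by grid cell (illustrative only)."""
    grid = {}
    for p in points:
        key = tuple(int(x // cell) for x in p)
        grid.setdefault(key, []).append(p)
    return [Cluster(ps) for ps in grid.values()]


random.seed(0)
points = [(random.random(), random.random()) for _ in range(500)]
clusters = build_clusters(points)
neighbors, examined = knn_with_pruning((0.4, 0.6), clusters, k=5)
```

In this sketch the upper bound is available before any individual point has been examined, which is what lets clusters be discarded early. The hierarchical algorithms described in the abstract apply the same kind of bound at every level of the tree.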