Bi-level Locality Sensitive Hashing for k-Nearest Neighbor Computation

  • Authors:
  • Jia Pan; Dinesh Manocha

  • Venue:
  • ICDE '12 Proceedings of the 2012 IEEE 28th International Conference on Data Engineering
  • Year:
  • 2012

Abstract

We present a new Bi-level LSH algorithm to perform approximate $k$-nearest neighbor search in high-dimensional spaces. Our formulation is based on a two-level scheme. In the first level, we use an RP-tree that divides the dataset into sub-groups with bounded aspect ratios and is used to distinguish well-separated clusters. In the second level, we compute a single LSH hash table for each sub-group along with a hierarchical structure based on space-filling curves. Given a query, we first determine the sub-group that it belongs to and perform $k$-nearest neighbor search within the suitable buckets in the LSH hash table corresponding to the sub-group. Our algorithm also maps well to current GPU architectures and can improve the quality of approximate $k$-nearest neighbor queries as compared to prior LSH-based algorithms. We highlight its performance on two large, high-dimensional image datasets. Given a runtime budget, Bi-level LSH can provide better accuracy in terms of recall or error ratio. Moreover, our formulation reduces the variation in runtime cost or the quality of results.
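
The two-level scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `BiLevelIndex` and `LeafLSH`, the parameters `leaf_size`, `num_hashes`, and `width`, and the fallback scan of the whole sub-group are all assumptions made for illustration. The first level is a simplified random-projection tree that splits at the median along a random direction, and the second level is one p-stable LSH table per sub-group. The hierarchy based on space-filling curves and the GPU mapping are omitted.

```python
import math
import random
from collections import defaultdict


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def random_unit_vector(dim, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(dot(v, v)) or 1.0
    return [x / norm for x in v]


class LeafLSH:
    """Level 2 (sketch): one p-stable LSH table over a single sub-group."""

    def __init__(self, points, ids, num_hashes, width, rng):
        dim = len(points[0])
        self.points, self.ids, self.width = points, ids, width
        # Each hash is floor((a . x + b) / width) with Gaussian a, uniform b.
        self.dirs = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                     for _ in range(num_hashes)]
        self.offsets = [rng.uniform(0.0, width) for _ in range(num_hashes)]
        self.table = defaultdict(list)
        for i, p in enumerate(points):
            self.table[self.key(p)].append(i)

    def key(self, p):
        return tuple(math.floor((dot(a, p) + b) / self.width)
                     for a, b in zip(self.dirs, self.offsets))

    def query(self, q, k):
        candidates = self.table.get(self.key(q), [])
        if len(candidates) < k:
            # Assumption of this sketch: fall back to scanning the sub-group.
            candidates = range(len(self.points))
        ranked = sorted(candidates, key=lambda i: math.dist(q, self.points[i]))
        return [self.ids[i] for i in ranked[:k]]


class BiLevelIndex:
    """Level 1 (sketch): simplified random-projection tree over the dataset."""

    def __init__(self, data, leaf_size=32, num_hashes=4, width=4.0, seed=0):
        self.rng = random.Random(seed)
        self.leaf_size = leaf_size
        self.params = (num_hashes, width)
        self.root = self._build(list(range(len(data))), data)

    def _leaf(self, ids, data):
        return LeafLSH([data[i] for i in ids], ids, *self.params, self.rng)

    def _build(self, ids, data):
        if len(ids) <= self.leaf_size:
            return self._leaf(ids, data)
        direction = random_unit_vector(len(data[0]), self.rng)
        proj = sorted(dot(direction, data[i]) for i in ids)
        threshold = proj[len(proj) // 2]  # median split along the direction
        left = [i for i in ids if dot(direction, data[i]) < threshold]
        right = [i for i in ids if dot(direction, data[i]) >= threshold]
        if not left or not right:  # degenerate split, e.g. duplicate points
            return self._leaf(ids, data)
        return (direction, threshold,
                self._build(left, data), self._build(right, data))

    def query(self, q, k):
        # Route the query to its sub-group, then search that group's table.
        node = self.root
        while isinstance(node, tuple):
            direction, threshold, left, right = node
            node = left if dot(direction, q) < threshold else right
        return node.query(q, k)
```

Calling `BiLevelIndex(data).query(q, k)` on a list of equal-length float lists returns the indices of up to `k` approximate neighbors of `q`, nearest first. Because only one sub-group and one bucket are probed, true neighbors that fall across a split or into an adjacent bucket can be missed; that is the recall trade-off the abstract refers to.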