Principal Component Hashing: An Accelerated Approximate Nearest Neighbor Search

  • Authors:
  • Yusuke Matsushita; Toshikazu Wada

  • Affiliations:
  • Graduate School of Systems Engineering, Wakayama University, Wakayama, Japan 640-8510; Graduate School of Systems Engineering, Wakayama University, Wakayama, Japan 640-8510

  • Venue:
  • PSIVT '09 Proceedings of the 3rd Pacific Rim Symposium on Advances in Image and Video Technology
  • Year:
  • 2009

Abstract

Nearest Neighbor (NN) search is a basic algorithm for data mining and machine learning applications. However, accelerating it in high-dimensional spaces is a difficult problem. To address this, approximate NN search algorithms have been investigated. In particular, locality-sensitive hashing (LSH) has recently attracted attention because it provides a clear relationship between the relative error ratio and the computational complexity. However, p-stable LSH computes hash values independently of the data distribution, so the search sometimes fails or takes a considerably long time. To solve this problem, we propose Principal Component Hashing (PCH), which exploits the distribution of the stored data. Through experiments, we confirmed that PCH is faster than ANN and LSH at the same accuracy.
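
The contrast drawn in the abstract between data-independent and data-dependent hashing can be illustrated with a minimal sketch. The first function uses the standard p-stable LSH form, floor((a . x + b) / w), with a random Gaussian vector a that ignores the data. The second part is only an assumption about what "exploiting the distribution of the stored data" might look like: it projects points onto the first principal direction (estimated here by power iteration) and cuts that axis into equal-frequency bins. It is not the authors' PCH algorithm, and all function names, parameters, and the synthetic data are hypothetical.

```python
import bisect
import math
import random


def pstable_hash(x, a, b, w):
    """Data-independent p-stable LSH hash: floor((a . x + b) / w)."""
    return math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / w)


def top_principal_direction(data, iters=100):
    """Estimate the mean and first principal direction by power iteration."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iters):
        # Apply X^T X to v without forming the covariance matrix explicitly.
        proj = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        u = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in u))
        if norm == 0.0:  # degenerate data or unlucky start; keep current v
            break
        v = [c / norm for c in u]
    return mean, v


def project(x, mean, v):
    """Coordinate of x along direction v, measured from the data mean."""
    return sum((xi - mi) * vi for xi, mi, vi in zip(x, mean, v))


def quantile_boundaries(values, n_bins):
    """Bin edges chosen so each bin holds roughly the same number of points."""
    s = sorted(values)
    return [s[len(s) * k // n_bins] for k in range(1, n_bins)]


def data_dependent_hash(x, mean, v, boundaries):
    """Assumed illustration: bucket index along the first principal direction."""
    return bisect.bisect_right(boundaries, project(x, mean, v))


# Synthetic data (hypothetical): spread is much larger along the first axis.
rng = random.Random(0)
data = [[rng.gauss(0.0, 10.0), rng.gauss(0.0, 1.0)] for _ in range(500)]

# Data-independent hash: a ~ N(0, I), b ~ U[0, w), chosen without seeing data.
w = 4.0
a = [rng.gauss(0.0, 1.0) for _ in range(2)]
b = rng.uniform(0.0, w)
lsh_keys = [pstable_hash(x, a, b, w) for x in data]

# Data-dependent hash: direction and bin edges are both fitted to the data.
mean, v = top_principal_direction(data)
boundaries = quantile_boundaries([project(x, mean, v) for x in data], 4)
counts = [0] * 4
for x in data:
    counts[data_dependent_hash(x, mean, v, boundaries)] += 1
```

In this sketch the equal-frequency bins keep bucket occupancy balanced by construction, whereas the fixed width w of the data-independent hash can leave some buckets crowded and others empty when the data are unevenly spread.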