Efficient high-dimensional indexing by sorting principal component

  • Authors: Jiangtao Cui; Shuisheng Zhou; Junding Sun

  • Affiliations: School of Computer Science and Technology, Xidian University, Xi'an 710071, China; School of Computer Science and Technology, Xidian University, Xi'an 710071, China; School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000, China

  • Venue: Pattern Recognition Letters
  • Year: 2007

Abstract

The vector approximation file (VA-file) approach is an efficient high-dimensional indexing method for image retrieval in large databases. Several extensions of the VA-file have been proposed to improve query performance. However, all of these methods rely on a sequential scan and therefore need to read the whole vector approximation file. In this paper, we present a new indexing structure based on the vector approximation method, in which only a small part of the approximation file needs to be accessed. First, principal component analysis is used to map the multidimensional points onto a 1D line. Then a B+-tree is built to index the approximate vectors according to their principal component. During k-nearest neighbor search, the partial distortion searching algorithm is used to reject unpromising approximate vectors. Only a small set of approximate vectors needs to be scanned sequentially during the search, which dramatically reduces both the CPU cost and the I/O cost. Experimental results on large image databases show that the new approach provides faster search than the other VA-file approaches.
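
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: a sorted Python list searched with `bisect` stands in for the B+-tree, the raw vectors are used in place of the quantized VA-file approximations, and the first principal axis is estimated by plain power iteration. All function names (`first_principal_axis`, `build_index`, `partial_sq_dist`, `knn`) are hypothetical. The pruning relies on the fact that, for a unit-length axis, the difference between two projected keys never exceeds the true Euclidean distance between the points.

```python
import bisect
import heapq
import math


def first_principal_axis(points, iters=50):
    """Estimate the mean and a unit-length first principal axis (power iteration)."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    w = [1.0 / math.sqrt(d)] * d  # unit-length starting guess
    for _ in range(iters):
        # Multiply by the unnormalized covariance matrix: w <- X^T (X w).
        proj = [sum(cj * wj for cj, wj in zip(c, w)) for c in centered]
        new_w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(v * v for v in new_w))
        if norm == 0.0:
            break  # degenerate data; keep the current unit vector
        w = [v / norm for v in new_w]
    return mean, w


def project(x, mean, w):
    """1D key of a point: its coordinate along the principal axis."""
    return sum((xj - mj) * wj for xj, mj, wj in zip(x, mean, w))


def build_index(points):
    """Sort point ids by 1D key (a sorted list stands in for the B+-tree)."""
    mean, w = first_principal_axis(points)
    entries = sorted((project(p, mean, w), i) for i, p in enumerate(points))
    keys = [key for key, _ in entries]
    ids = [i for _, i in entries]
    return mean, w, keys, ids


def partial_sq_dist(a, b, limit):
    """Partial distortion search: abandon the sum once it exceeds `limit`."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total > limit:
            return None
    return total


def knn(points, index, query, k):
    """Return the k nearest (distance, point id) pairs, nearest first."""
    mean, w, keys, ids = index
    qkey = project(query, mean, w)
    right = bisect.bisect_left(keys, qkey)
    left = right - 1
    heap = []  # max-heap on squared distance, stored as (-d2, id)

    def bound():
        return -heap[0][0] if len(heap) == k else math.inf

    while left >= 0 or right < len(keys):
        # Visit whichever side is closer to the query along the 1D axis.
        gap_l = qkey - keys[left] if left >= 0 else math.inf
        gap_r = keys[right] - qkey if right < len(keys) else math.inf
        if gap_l <= gap_r:
            gap, pos = gap_l, left
            left -= 1
        else:
            gap, pos = gap_r, right
            right += 1
        # The key gap is a lower bound on the true distance, so once it
        # exceeds the current k-th best, no remaining point can qualify.
        if gap * gap > bound():
            break
        d2 = partial_sq_dist(points[ids[pos]], query, bound())
        if d2 is not None:
            if len(heap) == k:
                heapq.heapreplace(heap, (-d2, ids[pos]))
            else:
                heapq.heappush(heap, (-d2, ids[pos]))
    return sorted((math.sqrt(-neg_d2), i) for neg_d2, i in heap)
```

A call such as `knn(points, build_index(points), query, 5)` returns the same neighbors as a full scan, since both pruning steps only discard candidates that cannot beat the current k-th best distance. How many candidates are actually skipped depends on how much of the data's variance lies along the first principal axis.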