The Active Vertice method: a performant filtering approach to high-dimensional indexing

  • Authors:
  • Sören Balko;Ingo Schmitt;Gunter Saake

  • Affiliations:
  • Department of Computer Science, University of Magdeburg, Universitätsplatz 2, D-39106 Magdeburg, Germany;Department of Computer Science, University of Magdeburg, Universitätsplatz 2, D-39106 Magdeburg, Germany;Department of Computer Science, University of Magdeburg, Universitätsplatz 2, D-39106 Magdeburg, Germany

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

The problem of finding nearest neighbors has emerged as an important foundation of feature-based similarity search in multimedia databases. Most spatial index structures based on the R-tree have failed to efficiently support nearest neighbor search in arbitrarily distributed high-dimensional data sets. In contrast, the so-called filtering principle as represented by the popular VA-file has turned out to be a more promising approach. Query processing is based on a flat file of compact vector approximations. In a first stage, those approximations are sequentially scanned and filtered so that in a second stage the nearest neighbors can be determined from a relatively small fraction of the data set.In this paper, we propose the Active Vertice method as a novel filtering approach. As opposed to the VA-file, approximation regions are arranged in a quad-tree like structure. High-dimensional feature vectors are assigned to ellipsoidal approximation regions on different levels of the tree. A compact approximation of a vector corresponds to the path within the index from the root to the respective tree node. When compared to the VA-file, our method enhances the discriminatory power of the approximations while maintaining their compactness in terms of storage consumption. To demonstrate its effectiveness, we conduct extensive experiments with synthetic as well as real-life data and show the superiority of our method over existing filtering approaches.