BP-tree: an efficient index for similarity search in high-dimensional metric spaces

  • Authors:
  • Jurandy Almeida;Ricardo da S. Torres;Neucimar J. Leite

  • Affiliations:
  • University of Campinas, Campinas, Brazil;University of Campinas, Campinas, Brazil;University of Campinas, Campinas, Brazil

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.01

Visualization

Abstract

Similarity search in high-dimensional metric spaces is a key operation in many applications, such as multimedia databases, image retrieval, object recognition, and others. The high dimensionality of the data requires special index structures to facilitate the search. Most of existing indexes are constructed by partitioning the data set using distance-based criteria. However, those methods either produce disjoint partitions, but ignore the distribution properties of the data; or produce non-disjoint groups, which greatly affect the search performance. In this paper, we study the performance of a new index structure, called Ball-and-Plane tree (BP-tree), which overcomes the above disadvantages. BP-tree is constructed by recursively dividing the data set into compact clusters. Distinctive from other techniques, it integrates the advantages of both disjoint and non-disjoint paradigms in order to achieve a structure of tight and low overlapping clusters, yielding significantly improved performance. Results obtained from an extensive experimental evaluation with real-world data sets show that BP-tree consistently outperforms state-of-the-art solutions.