Pivot selection method for optimizing both pruning and balancing in metric space indexes

  • Authors: Hisashi Kurasawa; Daiji Fukagawa; Atsuhiro Takasu; Jun Adachi
  • Affiliations: The University of Tokyo, Chiyoda-ku, Tokyo, Japan; Doshisha University, Kyotanabe-shi, Kyoto, Japan; National Institute of Informatics, Chiyoda-ku, Tokyo, Japan; National Institute of Informatics, Chiyoda-ku, Tokyo, Japan
  • Venue: DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part II
  • Year: 2010

Abstract

We investigated how to reduce the cost of nearest neighbor searches in metric spaces. Many similarity search indexes recursively divide a region into subregions by using pivots and construct a tree-structured index. A problem with existing indexes is that they focus only on pruning objects and do not take tree balance into consideration. As a result, the balance of these indexes depends on the data distribution, and they do not reduce the search cost for all data. We propose a similarity search index called the Partitioning Capacity Tree (PCTree). PCTree automatically optimizes pivot selection based on both the balance of the regions partitioned by a pivot and the estimated effectiveness of search pruning by that pivot. Consequently, PCTree reduces the search cost for various data distributions. Our evaluations, which compared PCTree with four indexes on three real datasets, showed that it successfully reduces the search cost and handles various data distributions well.
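
The abstract does not state the scoring formula that PCTree uses, so the following Python sketch only illustrates the general idea of ranking candidate pivots by both balance and estimated pruning. Everything in it is an assumption made for illustration: the names `score_pivot` and `select_pivot`, the choice of the mean distance as the split threshold, the fixed `query_radius`, and the product used to combine the two terms. It should not be read as the authors' method.

```python
from statistics import fmean


def score_pivot(pivot, objects, dist, query_radius):
    """Hypothetical score: balance of the split times estimated pruning rate."""
    distances = [dist(pivot, obj) for obj in objects]
    # Assumed split rule: objects within the mean distance go "inside".
    threshold = fmean(distances)
    inside = sum(1 for d in distances if d <= threshold)
    outside = len(distances) - inside
    # 1.0 for an even split, 0.0 when every object falls on one side.
    balance = 2 * min(inside, outside) / len(distances)
    # Fraction of objects farther than query_radius from the split boundary.
    # A range query centred on such an object can skip the other subregion.
    prunable = sum(
        1 for d in distances if abs(d - threshold) > query_radius
    ) / len(distances)
    return balance * prunable


def select_pivot(candidates, objects, dist, query_radius):
    """Return the candidate with the highest combined score."""
    return max(
        candidates,
        key=lambda p: score_pivot(p, objects, dist, query_radius),
    )


def abs_diff(a, b):
    """Toy metric on one-dimensional points."""
    return abs(a - b)


# Toy data: two clusters on a line.
points = [0, 1, 2, 3, 10, 11, 12, 13]
best = select_pivot([0, 6], points, abs_diff, query_radius=1)
```

On this toy data, the pivot at 0 separates the two clusters evenly and leaves no object near the split boundary, whereas the pivot at 6 sits between the clusters, so most objects lie close to its boundary and the split is uneven. A multiplicative combination is used here so that a pivot scoring poorly on either criterion is penalized; a weighted sum would be an equally plausible choice.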