Maximal metric margin partitioning for similarity search indexes

  • Authors:
  • Hisashi Kurasawa; Daiji Fukagawa; Atsuhiro Takasu; Jun Adachi

  • Affiliations:
  • The University of Tokyo, Tokyo, Japan; National Institute of Informatics, Tokyo, Japan; National Institute of Informatics, Tokyo, Japan; National Institute of Informatics, Tokyo, Japan

  • Venue:
  • Proceedings of the 18th ACM Conference on Information and Knowledge Management
  • Year:
  • 2009

Abstract

We propose a partitioning scheme for similarity search indexes called Maximal Metric Margin Partitioning (MMMP). MMMP divides the data according to its distribution pattern, paying particular attention to the boundaries of clusters. A partitioning surface created by MMMP tends to lie at the maximum distance from the boundaries of the two clusters it separates. MMMP is the first similarity search indexing approach to focus on partitioning surfaces and data distribution patterns. We also present an indexing scheme, the MMMP-Index, which combines MMMP with small ball partitioning. The MMMP-Index prunes many objects that are not relevant to a query and thereby reduces the query execution cost. Our experimental results show that MMMP indexes clustered data effectively and reduces the search cost. For clustered vector data, the MMMP-Index reduces the computational cost to less than two-thirds of that of comparable schemes.
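
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of margin-based partitioning in a metric space, not the authors' MMMP or MMMP-Index. It assumes two given pivots `p1` and `p2`, scores each object by f(x) = d(x, p1) - d(x, p2), and places the split in the widest gap between consecutive scores. All names (`max_margin_split`, `range_search`, the pivots, the Euclidean metric) are hypothetical choices made for illustration. Pruning relies on the triangle inequality, which gives |f(q) - f(x)| <= 2 d(q, x) for any query q.

```python
import math


def euclidean(a, b):
    """Example metric; any function satisfying the triangle inequality works."""
    return math.dist(a, b)


def max_margin_split(points, p1, p2, dist=euclidean):
    """Split points at the widest gap of f(x) = d(x, p1) - d(x, p2).

    Assumption: this gap-based rule only illustrates the "maximal margin"
    idea; it is not the partitioning rule from the paper.
    Requires at least two points.
    """
    scored = sorted(
        ((dist(x, p1) - dist(x, p2), x) for x in points),
        key=lambda t: t[0],
    )
    # Index i such that the gap between scored[i] and scored[i + 1] is largest.
    best = max(
        range(len(scored) - 1),
        key=lambda i: scored[i + 1][0] - scored[i][0],
    )
    lo, hi = scored[best][0], scored[best + 1][0]
    left = [x for _, x in scored[: best + 1]]   # all have f(x) <= lo
    right = [x for _, x in scored[best + 1:]]   # all have f(x) >= hi
    return left, right, lo, hi


def range_search(query, radius, split, p1, p2, dist=euclidean):
    """Return all indexed points within `radius` of `query`.

    A side is skipped when no point on it can satisfy
    |f(query) - f(x)| <= 2 * radius.
    """
    left, right, lo, hi = split
    fq = dist(query, p1) - dist(query, p2)
    candidates = []
    if fq - 2 * radius <= lo:   # left side may contain matches
        candidates += left
    if fq + 2 * radius >= hi:   # right side may contain matches
        candidates += right
    return [x for x in candidates if dist(query, x) <= radius]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    p1, p2 = (0, 0), (10, 10)
    split = max_margin_split(data, p1, p2)
    print(range_search((0.5, 0.5), 1.0, split, p1, p2))
```

In this toy data set the widest gap falls between the two clusters, so a query near one cluster skips the other cluster without computing any distances to its members. The wider the gap, the larger the query radius for which one side can still be skipped.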