iDISQUE: tuning high-dimensional similarity queries in DHT networks

  • Authors:
  • Xiaolong Zhang; Lidan Shou; Kian-Lee Tan; Gang Chen

  • Affiliations:
  • College of Computer Science, Zhejiang University, China; College of Computer Science, Zhejiang University, China; School of Computing, National University of Singapore, Singapore; College of Computer Science, Zhejiang University, China

  • Venue:
  • DASFAA'10: Proceedings of the 15th International Conference on Database Systems for Advanced Applications, Volume Part I
  • Year:
  • 2010

Abstract

In this paper, we propose iDISQUE, a fully decentralized framework that supports tunable approximate similarity queries over high-dimensional data in DHT networks. The framework uses a distributed indexing scheme to organize data summary structures called iDisques, which describe the cluster information of the data on each peer. iDisques are published through a locality-preserving mapping scheme, and approximate similarity queries are resolved using the resulting distributed index. The accuracy of query results can be tuned through both the publishing cost and the query cost. We employ a multi-probe technique to reduce the index size without compromising query effectiveness, and we propose an effective load-balancing technique that is also based on multi-probing. Experiments on real and synthetic datasets confirm the effectiveness and efficiency of iDISQUE.
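
The abstract does not say which locality-preserving mapping or probing order the framework uses. As a rough illustration only, the sketch below assumes a quantized random-projection hash (in the style of p-stable locality-sensitive hashing) to map a cluster center to a DHT key, and a simplified multi-probe step that also visits buckets differing by one in a single coordinate. The names `LocalityPreservingMapper`, `probes`, and `dht_key`, and the parameters `num_proj`, `width`, and `bits`, are hypothetical and are not taken from the paper.

```python
import hashlib
import random


class LocalityPreservingMapper:
    """Hypothetical mapper: quantized random projections of a point.

    This is an assumed stand-in for the paper's mapping scheme, not the
    authors' actual method.
    """

    def __init__(self, dim, num_proj=4, width=1.0, seed=0):
        rng = random.Random(seed)
        # Random Gaussian directions and uniform offsets, one pair per projection.
        self.dirs = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                     for _ in range(num_proj)]
        self.offsets = [rng.uniform(0.0, width) for _ in range(num_proj)]
        self.width = width

    def bucket(self, point):
        """Quantize each projection; nearby points tend to share a bucket."""
        return tuple(
            int((sum(d * x for d, x in zip(direction, point)) + b) // self.width)
            for direction, b in zip(self.dirs, self.offsets)
        )

    def probes(self, point, num_probes):
        """Return the home bucket first, then buckets that differ from it
        by +/-1 in exactly one coordinate, truncated to num_probes."""
        home = self.bucket(point)
        candidates = [home]
        for i in range(len(home)):
            for delta in (-1, 1):
                neighbor = list(home)
                neighbor[i] += delta
                candidates.append(tuple(neighbor))
        return candidates[:max(1, num_probes)]


def dht_key(bucket, bits=32):
    """Hash a bucket tuple to an integer key in [0, 2**bits)."""
    digest = hashlib.sha1(repr(bucket).encode("utf-8")).digest()
    return int.from_bytes(digest[:bits // 8], "big")


# Usage: publish a cluster summary under its home key, then query with probes.
mapper = LocalityPreservingMapper(dim=3)
index = {}  # stands in for the distributed index: key -> list of summaries
center = [0.10, 0.20, 0.30]
index.setdefault(dht_key(mapper.bucket(center)), []).append(("peer-A", center))

query = [0.12, 0.21, 0.29]
hits = []
for b in mapper.probes(query, num_probes=5):
    hits.extend(index.get(dht_key(b), []))
```

In this sketch, publishing each summary under a single key while letting queries probe several keys is what keeps the index small. A larger `num_probes` raises the query cost in exchange for a better chance of reaching relevant summaries, which mirrors the cost/accuracy tuning the abstract describes.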