iPoc: a polar coordinate based indexing method for nearest neighbor search in high dimensional space

  • Authors:
  • Zhang Liu;Chaokun Wang;Peng Zou;Wei Zheng;Jianmin Wang

  • Affiliations:
  • Department of Computer Science and Technology, Tsinghua University;School of Software, Tsinghua University, Beijing, China and Tsinghua National Laboratory for Information Science and Technology and Key Laboratory for Information System Security, Ministry of Educ ...;School of Software, Tsinghua University, Beijing, China;School of Software, Tsinghua University, Beijing, China;School of Software, Tsinghua University, Beijing, China and Tsinghua National Laboratory for Information Science and Technology and Key Laboratory for Information System Security, Ministry of Educ ...

  • Venue:
  • WAIM'10 Proceedings of the 11th international conference on Web-age information management
  • Year:
  • 2010

Abstract

K-nearest neighbor (KNN) search in high-dimensional space is essential for database applications, especially multimedia database applications, because images and audio clips are typically modeled as high-dimensional vectors. However, the performance of existing indexing methods degrades dramatically as the dimensionality increases. In this paper, we propose a novel polar coordinate based indexing method, called iPoc, for efficient KNN search in high-dimensional space. First, the data space is partitioned into hypersphere regions, and then each hypersphere is further refined into hypersectors via hyperspherical surface clustering. After that, a series of local polar coordinate systems is derived from the hypersectors, taking advantage of their geometric characteristics. During search, iPoc can effectively prune query-unrelated data points by estimating lower and upper bounds in both the radial coordinate and the angular coordinate. Furthermore, we design a key mapping scheme to merge keys measured in independent local polar coordinate systems into global polar coordinates. Finally, the global polar coordinates are indexed by a traditional 2-dimensional spatial index, e.g., an R-tree. Extensive experiments on real audio datasets and synthetic datasets confirm the effectiveness and efficiency of our proposal and show that iPoc is more efficient than existing high-dimensional KNN search methods.
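
The pruning idea in the abstract can be illustrated with a small sketch. This is not the iPoc algorithm itself: the abstract gives no implementation details, so everything below is an assumption made for illustration. It uses a single polar coordinate system (one hypothetical `center` and reference `axis`) instead of clustered hypersectors, and a sorted linear scan instead of an R-tree. Each point is reduced to a 2-dimensional key (radius, angle to the axis), and the law of cosines combined with the triangle inequality for angles gives a lower bound on the true distance that can be used to skip points. The function names `polar_key`, `lower_bound`, and `knn` are hypothetical.

```python
import math


def polar_key(point, center, axis):
    """Map a point to (r, theta): distance to center, angle to the axis."""
    diff = [p - c for p, c in zip(point, center)]
    r = math.sqrt(sum(d * d for d in diff))
    if r == 0.0:
        return 0.0, 0.0  # the angle is undefined at the center; any value works
    axis_len = math.sqrt(sum(a * a for a in axis))
    cos_t = sum(d * a for d, a in zip(diff, axis)) / (r * axis_len)
    theta = math.acos(max(-1.0, min(1.0, cos_t)))  # clamp rounding error
    return r, theta


def lower_bound(key_p, key_q):
    """Lower bound on the Euclidean distance between two points, from keys.

    The angle between the two points seen from the center is at least
    |theta_p - theta_q|, so the law of cosines with that smaller angle
    cannot overestimate the true distance.
    """
    r_p, t_p = key_p
    r_q, t_q = key_q
    sq = r_p * r_p + r_q * r_q - 2.0 * r_p * r_q * math.cos(abs(t_p - t_q))
    return math.sqrt(max(0.0, sq))


def knn(query, points, center, axis, k):
    """Exact KNN that skips points whose lower bound exceeds the k-th best."""
    q_key = polar_key(query, center, axis)
    # In a real index the keys would be computed once at build time.
    bounds = [lower_bound(polar_key(p, center, axis), q_key) for p in points]
    order = sorted(range(len(points)), key=lambda i: bounds[i])
    best = []      # (distance, index) pairs, kept sorted, at most k long
    checked = 0    # number of full-dimensional distance computations
    for i in order:
        if len(best) == k and bounds[i] > best[-1][0]:
            break  # every remaining point has an even larger lower bound
        best.append((math.dist(points[i], query), i))
        best.sort()
        best = best[:k]
        checked += 1
    return best, checked
```

Calling `knn(query, points, center, axis, k)` returns the k nearest points as (distance, index) pairs together with the number of full distance computations performed. How much is pruned depends heavily on the choice of `center` and `axis`, which is presumably what the clustering into hypersectors is meant to improve.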