A unified approach for computing top-k pairs in multidimensional space

  • Authors:
  • Muhammad Aamir Cheema;Xuemin Lin;Haixun Wang;Jianmin Wang;Wenjie Zhang

  • Affiliations:
  • The University of New South Wales, Australia;The University of New South Wales, Australia;Microsoft Research Asia, China;School of Software, Tsinghua University, Tsinghua National Laboratory for Information Science and Technology, China;The University of New South Wales, Australia

  • Venue:
  • ICDE '11 Proceedings of the 2011 IEEE 27th International Conference on Data Engineering
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Top-k pairs queries have many real applications. k closest pairs queries, k furthest pairs queries and their bichromatic variants are some of the examples of the top-k pairs queries that rank the pairs on distance functions. While these queries have received significant research attention, there does not exist a unified approach that can efficiently answer all these queries. Moreover, there is no existing work that supports top-k pairs queries based on generic scoring functions. In this paper, we present a unified approach that supports a broad class of top-k pairs queries including the queries mentioned above. Our proposed approach allows the users to define a local scoring function for each attribute involved in the query and a global scoring function that computes the final score of each pair by combining its scores on different attributes. We propose efficient internal and external memory algorithms and our theoretical analysis shows that the expected performance of the algorithms is optimal when two or less attributes are involved. Our approach does not require any pre-built indexes, is easy to implement and has low memory requirement. We conduct extensive experiments to demonstrate the efficiency of our proposed approach.